Update on LZMA compression
Chris Hegarty
chris.hegarty at oracle.com
Fri Jul 29 08:56:07 PDT 2011
Hi All,
This mail is a follow-up to a previous mail thread I started back in
June [1], to look at LZMA compression for certain parts of jmod
packages. I've played a little more with it since then, and here is a
summary of my findings.
Originally I used the Java implementation from the latest 9.20 SDK [2],
but since then I have modified jpkg to use the native implementations of
LZMA and LZMA2. Initially the results were marginally better, but nothing
to write home about. Then I tried archiving the native libraries section
and applying LZMA(2) to the archive as a whole. For this I simply used
ZIP with the mode set to store (no compression). This gives much better
compression with native LZMA(2), and marginally better compression with
the Java implementation. There are more knobs available with the native
implementation to increase compression levels. The difference between
LZMA and LZMA2 is minuscule, with LZMA actually giving marginally better
compression (I would have guessed this, since LZMA2 uses subchunks to
support multithreading). This approach is similar to some package
distributions on Linux that use tar.lzma or tar.xz. In fact, test
programs have shown that ZIP_LZMA(2) gives around the same level of
compression as tar.lzma and tar.xz for typical native library content.
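
As a rough illustration of the "archive with ZIP in store mode, then
compress the whole archive" idea, here is a minimal sketch using Python's
standard zipfile and lzma modules. It is not the jpkg implementation: the
function names, the preset value, and the sample file names are all
assumptions made for the example. FORMAT_XZ selects an LZMA2 stream and
FORMAT_ALONE the older .lzma (LZMA1) container.

```python
import io
import lzma
import zipfile


def pack_stored_then_lzma(files, fmt=lzma.FORMAT_XZ, preset=6):
    """Archive `files` (name -> bytes) uncompressed, then compress the archive.

    Hypothetical helper for illustration only; not part of jpkg.
    """
    buf = io.BytesIO()
    # ZIP_STORED writes each entry without compression, so ZIP acts purely
    # as an archiver and the compressor sees all entries as one stream.
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return lzma.compress(buf.getvalue(), format=fmt, preset=preset)


def unpack(blob):
    """Reverse pack_stored_then_lzma: decompress, then read every entry."""
    raw = lzma.decompress(blob)  # auto-detects .xz and legacy .lzma
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


if __name__ == "__main__":
    # Synthetic stand-ins for native libraries (assumed names and content).
    sample = {"lib/liba.so": b"ABCD" * 5000, "lib/libb.so": b"ABCD" * 5000}
    packed = pack_stored_then_lzma(sample)
    print(len(packed), "bytes packed from",
          sum(len(v) for v in sample.values()), "bytes of input")
```

Note that unpack() has to decompress the whole stream before any single
entry can be read, which is the trade-off of compressing the archive as
one unit rather than entry by entry.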
I looked at the jmod packages built on all platforms/architectures. They
differ in size, number of classes and native libraries, but using a
combination of ZIP (for archiving) and LZMA (for compression) gives us
very favorable jmod package sizes (compared to the original and deb
packages). Here are a few examples, using the two largest packages,
jdk.boot and sun.desktop, and the total size of all the jdk module
packages (jigsaw-pkgs):
Solaris-i586:
> : ls -la */jdk.boot*
-rw-rw-r-- 1 root 6018  6805823 Jul 28 15:19 jmod-lzma/jdk.boot@7-ea.jmod
-rw-rw-r-- 1 root 6018  6807999 Jul 28 15:22 jmod-lzma2/jdk.boot@7-ea.jmod
-rw-rw-r-- 1 root 6018 12159909 Jul 28 15:16 jmod/jdk.boot@7-ea.jmod
> : ls -la */sun.desktop*
-rw-rw-r-- 1 root 6018 4548927 Jul 28 15:20 jmod-lzma/sun.desktop@7-ea.jmod
-rw-rw-r-- 1 root 6018 4549255 Jul 28 15:24 jmod-lzma2/sun.desktop@7-ea.jmod
-rw-rw-r-- 1 root 6018 5731426 Jul 28 15:17 jmod/sun.desktop@7-ea.jmod
>: du -sk *
26840 jmod
19112 jmod-lzma
19112 jmod-lzma2 << (~29% reduction)
Solaris-sparc:
>: ls -la */jdk.boot*
-rw-r--r-- 1 root java  6715654 Jul 29 16:00 jmod-lzma/jdk.boot@7-ea.jmod
-rw-r--r-- 1 root java  6709674 Jul 29 16:07 jmod-lzma2/jdk.boot@7-ea.jmod
-rw-r--r-- 1 root java 12546939 Jul 29 15:52 jmod/jdk.boot@7-ea.jmod
>: ls -la */sun.desktop*
-rw-r--r-- 1 root java 6228344 Jul 29 16:03 jmod-lzma/sun.desktop@7-ea.jmod
-rw-r--r-- 1 root java 6230798 Jul 29 16:11 jmod-lzma2/sun.desktop@7-ea.jmod
-rw-r--r-- 1 root java 8606906 Jul 29 15:56 jmod/sun.desktop@7-ea.jmod
>: du -sk *
62628 jmod
43318 jmod-lzma
43320 jmod-lzma2 << (~30% reduction)
Linux-x64:
> : ls -la */jdk.boot*
-rw-rw-r-- 1 root 6018 4905902 Jul 28 15:08 jmod-lzma/jdk.boot@7-ea.jmod
-rw-rw-r-- 1 root 6018 4909477 Jul 28 15:11 jmod-lzma2/jdk.boot@7-ea.jmod
-rw-rw-r-- 1 root 6018 6564630 Jul 28 15:06 jmod/jdk.boot@7-ea.jmod
> : ls -la */sun.desktop*
-rw-rw-r-- 1 root 6018 4768691 Jul 28 15:10 jmod-lzma/sun.desktop@7-ea.jmod
-rw-rw-r-- 1 root 6018 4769668 Jul 28 15:13 jmod-lzma2/sun.desktop@7-ea.jmod
-rw-rw-r-- 1 root 6018 5985300 Jul 28 15:07 jmod/sun.desktop@7-ea.jmod
> : du -sk *
21593 jmod
17583 jmod-lzma
17584 jmod-lzma2 << (~19% reduction)
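
For reference, the reduction percentages quoted next to the du totals
can be reproduced with a few lines of Python. The numbers are copied from
the du -sk output above; the dictionary layout and the function name are
just choices made for this sketch.

```python
# du -sk totals in KB, copied from the listings above.
TOTALS_KB = {
    "Solaris-i586": {"jmod": 26840, "jmod-lzma": 19112, "jmod-lzma2": 19112},
    "Solaris-sparc": {"jmod": 62628, "jmod-lzma": 43318, "jmod-lzma2": 43320},
    "Linux-x64": {"jmod": 21593, "jmod-lzma": 17583, "jmod-lzma2": 17584},
}


def reduction_percent(original_kb, compressed_kb):
    """Size saved relative to the original, as a percentage."""
    return 100.0 * (original_kb - compressed_kb) / original_kb


if __name__ == "__main__":
    for platform, totals in TOTALS_KB.items():
        base = totals["jmod"]
        for variant in ("jmod-lzma", "jmod-lzma2"):
            saved = reduction_percent(base, totals[variant])
            print(f"{platform:14s} {variant:11s} {saved:5.1f}% smaller")
```

This works out to roughly 28.8% for Solaris-i586, 30.8% for
Solaris-sparc and 18.6% for Linux-x64, in line with the rounded figures
quoted above.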
I did not use the maximum LZMA compression level when generating these
results. I found during various runs that using the higher levels of
compression gives very little gain and increases the compression time
quite a bit. Instead, using a level just one above the default gives
very good compression and a reasonable compression time. That said,
generating packages is a one-time event, and decompressing/installing is
more important. Installation/extraction times of packages using LZMA are
about 1.5 - 2 times longer than those of existing packages.
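
One way to explore this size versus time trade-off is to sweep a few
compression levels over the same input and record the output size and
the compression and decompression times. The sketch below uses the
presets of Python's lzma module (0 to 9, default 6), which is an
assumption for illustration: they do not necessarily map onto the levels
of the native implementation used for the results above. The function
name and the synthetic input are also made up for the example.

```python
import lzma
import time


def sweep_presets(data, presets=(1, 6, 7, 9)):
    """Return (preset, compressed_size, compress_s, decompress_s) per preset."""
    results = []
    for preset in presets:
        start = time.perf_counter()
        blob = lzma.compress(data, preset=preset)
        compress_s = time.perf_counter() - start

        start = time.perf_counter()
        restored = lzma.decompress(blob)
        decompress_s = time.perf_counter() - start

        # Guard against reporting numbers for a broken round trip.
        if restored != data:
            raise ValueError(f"round trip failed for preset {preset}")
        results.append((preset, len(blob), compress_s, decompress_s))
    return results


if __name__ == "__main__":
    # Synthetic input; real native libraries will behave differently.
    payload = bytes(range(256)) * 4096
    for preset, size, c_s, d_s in sweep_presets(payload):
        print(f"preset {preset}: {size} bytes, "
              f"compress {c_s:.3f}s, decompress {d_s:.3f}s")
```

Timings depend heavily on the input and the machine, so a sweep like
this is only meaningful when run against the actual package content.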
I don't see compression as 'one size fits all'. For my tests I added a
new option to jpkg to enable LZMA. I guess the key here is to find a
good match for the JDK packages and possibly support
creation/extraction/installation of both the existing GZIP format and
LZMA.
It is worth noting that archiving the native libraries within the jmod
package removes the ability to extract a single library individually,
but I don't think this should be a problem or conflict with the goal of
having the package format streamable. We already do something similar
for classes, which are stored as a gzipped pack200 archive. Please shout
if I have misinterpreted this goal.
-Chris.
[1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2011-June/001332.html
[2] http://www.7-zip.org/sdk.html