Update on LZMA compression

Chris Hegarty chris.hegarty at oracle.com
Fri Jul 29 08:56:07 PDT 2011


Hi All,

This mail is a follow-up to a previous mail thread I started back in 
June [1], to look at LZMA compression for certain parts of jmod 
packages. I've played a little more with it since then, and here is a 
summary of my findings.

Originally I used the Java implementation from the latest 9.20 SDK [2], 
but since then I have modified jpkg to use the native implementations of 
LZMA and LZMA2. Initially the results were marginally better, but nothing 
to write home about. Then I tried archiving the native libraries section 
and applying LZMA(2) to the archive as a whole. For this I simply used 
ZIP with the mode set to store (no compression). This gives much better 
compression with native LZMA(2), and marginally better compression with 
the Java implementation. There are more knobs available in the native 
implementation to increase compression levels. The difference between 
LZMA and LZMA2 is minuscule, with LZMA actually giving marginally better 
compression (I would have guessed this, since LZMA2 uses subchunks to 
support multithreading). This approach is similar to some package 
distributions on Linux that use tar.lzma or tar.xz. In fact, test 
programs have shown that ZIP_LZMA(2) gives around the same level of 
compression as tar.lzma and tar.xz for typical native library content.
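
To illustrate why archiving first and then compressing helps, here is a 
minimal Python sketch. It is not the jpkg code: it uses Python's lzma 
module (liblzma) rather than the 7-Zip SDK, and the "libraries" are 
made-up byte strings that deliberately share a large common block, so 
the effect is exaggerated compared to real native libraries. The names 
make_fake_libraries and store_in_zip are hypothetical helpers.

```python
import io
import lzma
import random
import zipfile


def make_fake_libraries(count=4, shared_size=20000, unique_size=2000, seed=0):
    """Build stand-in 'native libraries' that share one large common block."""
    rng = random.Random(seed)
    shared = bytes(rng.getrandbits(8) for _ in range(shared_size))
    libs = {}
    for i in range(count):
        unique = bytes(rng.getrandbits(8) for _ in range(unique_size))
        libs[f"lib{i}.so"] = shared + unique
    return libs


def store_in_zip(files):
    """Archive the files with ZIP in store mode (no compression)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


libs = make_fake_libraries()
archive = store_in_zip(libs)

# LZMA2 in an .xz container, and legacy LZMA in an .lzma container.
whole_lzma2 = lzma.compress(archive, format=lzma.FORMAT_XZ)
whole_lzma = lzma.compress(archive, format=lzma.FORMAT_ALONE)

# Baseline: compress each library on its own and add up the sizes.
per_file_total = sum(
    len(lzma.compress(data, format=lzma.FORMAT_XZ)) for data in libs.values()
)

print("stored zip     :", len(archive))
print("per-file LZMA2 :", per_file_total)
print("zip then LZMA2 :", len(whole_lzma2))
print("zip then LZMA  :", len(whole_lzma))
```

Compressing the stored archive lets the compressor reuse matches across 
file boundaries, which per-file compression cannot do. The same script 
also prints the LZMA and LZMA2 sizes side by side for comparison.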

I looked at the jmod packages built on all platforms/architectures. They 
differ in size and in the number of classes and native libraries, but 
using a combination of ZIP (for archiving) and LZMA (for compression) 
gives us very favorable jmod package sizes (compared to the original and 
deb packages). Here are a few examples, using the two largest packages, 
jdk.boot and sun.desktop, and the total size of all the jdk module 
packages (jigsaw-pkgs):

Solaris-i586:

 > :  ls -la */jdk.boot*
-rw-rw-r--   1 root 6018     6805823 Jul 28 15:19 jmod-lzma/jdk.boot@7-ea.jmod
-rw-rw-r--   1 root 6018     6807999 Jul 28 15:22 jmod-lzma2/jdk.boot@7-ea.jmod
-rw-rw-r--   1 root 6018     12159909 Jul 28 15:16 jmod/jdk.boot@7-ea.jmod
 > :  ls -la */sun.desktop*
-rw-rw-r--   1 root 6018     4548927 Jul 28 15:20 jmod-lzma/sun.desktop@7-ea.jmod
-rw-rw-r--   1 root 6018     4549255 Jul 28 15:24 jmod-lzma2/sun.desktop@7-ea.jmod
-rw-rw-r--   1 root 6018     5731426 Jul 28 15:17 jmod/sun.desktop@7-ea.jmod
 >: du -sk *
26840   jmod
19112   jmod-lzma
19112   jmod-lzma2 << (~29% reduction)

Solaris-sparc:

:> ls -la */jdk.boot*
-rw-r--r--   1 root java     6715654 Jul 29 16:00 jmod-lzma/jdk.boot@7-ea.jmod
-rw-r--r--   1 root java     6709674 Jul 29 16:07 jmod-lzma2/jdk.boot@7-ea.jmod
-rw-r--r--   1 root java     12546939 Jul 29 15:52 jmod/jdk.boot@7-ea.jmod
 >: ls -la */sun.desktop*
-rw-r--r--   1 root java     6228344 Jul 29 16:03 jmod-lzma/sun.desktop@7-ea.jmod
-rw-r--r--   1 root java     6230798 Jul 29 16:11 jmod-lzma2/sun.desktop@7-ea.jmod
-rw-r--r--   1 root java     8606906 Jul 29 15:56 jmod/sun.desktop@7-ea.jmod
 >: du -sk *
62628   jmod
43318   jmod-lzma
43320   jmod-lzma2 << (~30% reduction)

Linux-x64:

 > : ls -la */jdk.boot*
-rw-rw-r--   1 root 6018     4905902 Jul 28 15:08 jmod-lzma/jdk.boot@7-ea.jmod
-rw-rw-r--   1 root 6018     4909477 Jul 28 15:11 jmod-lzma2/jdk.boot@7-ea.jmod
-rw-rw-r--   1 root 6018     6564630 Jul 28 15:06 jmod/jdk.boot@7-ea.jmod
 > :  ls -la */sun.desktop*
-rw-rw-r--   1 root 6018     4768691 Jul 28 15:10 jmod-lzma/sun.desktop@7-ea.jmod
-rw-rw-r--   1 root 6018     4769668 Jul 28 15:13 jmod-lzma2/sun.desktop@7-ea.jmod
-rw-rw-r--   1 root 6018     5985300 Jul 28 15:07 jmod/sun.desktop@7-ea.jmod
 > : du -sk *
21593   jmod
17583   jmod-lzma
17584   jmod-lzma2 << (~19% reduction)
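
The reduction figures quoted next to the du totals can be checked with 
a few lines of Python. The numbers are copied from the listings above 
(du -sk output in KB, jmod versus jmod-lzma); only the helper name 
reduction_percent is made up for this sketch.

```python
# du -sk totals in KB, copied from the listings: (jmod, jmod-lzma)
totals_kb = {
    "Solaris-i586": (26840, 19112),
    "Solaris-sparc": (62628, 43318),
    "Linux-x64": (21593, 17583),
}


def reduction_percent(before, after):
    """Size saved, as a percentage of the original size."""
    return 100.0 * (before - after) / before


reductions = {
    platform: reduction_percent(before, after)
    for platform, (before, after) in totals_kb.items()
}

for platform, pct in reductions.items():
    print(f"{platform}: {pct:.1f}% smaller")
```

This works out to roughly 28.8%, 30.8% and 18.6%, in line with the 
approximate percentages given above.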

I did not use the maximum LZMA compression level when generating these 
results. I found during various runs that the higher levels of 
compression give very little gain and increase the compression time 
quite a bit. Instead, using a level just one above the default gives very 
good compression and a reasonable compression time. That said, generating 
packages is a one-time event, and decompression/installation time is more 
important. Installation/extraction times of packages using LZMA are 
about 1.5 to 2 times longer than those of existing packages.
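
The level trade-off can be explored with a small Python sketch along 
these lines. It is an illustration only: it uses the xz presets of 
Python's lzma module (0 to 9, default 6), which are not the same scale 
as the 7-Zip SDK levels used for the results above, and the payload is 
invented repetitive text rather than real package content. Timings will 
vary by machine.

```python
import lzma
import time


def measure(data, preset):
    """Return (compressed size, compress seconds, decompress seconds)."""
    start = time.perf_counter()
    packed = lzma.compress(data, preset=preset)
    mid = time.perf_counter()
    unpacked = lzma.decompress(packed)
    end = time.perf_counter()
    assert unpacked == data  # sanity check on the round trip
    return len(packed), mid - start, end - mid


# Hypothetical payload: repetitive text standing in for package content.
payload = b"".join(
    b"symbol_%06d: call helper_%03d\n" % (i, i % 97) for i in range(50000)
)

# Low, default, one above default, and maximum preset.
results = {preset: measure(payload, preset) for preset in (1, 6, 7, 9)}

for preset, (size, c_time, d_time) in results.items():
    print(
        f"preset {preset}: {size} bytes, "
        f"compress {c_time:.3f}s, decompress {d_time:.3f}s"
    )
```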

I don't see compression as 'one size fits all'. For my tests I added a 
new option to jpkg to enable LZMA. I guess the key here is to find a 
good match for the JDK packages, and possibly to support 
creation/extraction/installation of both the existing GZIP and LZMA.
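
One way to support both formats side by side is to pick the method at 
creation time and detect it from the leading magic bytes at extraction 
time. The sketch below is an assumption about how that could look, not 
the jpkg option or the jmod file format: compress_section and 
decompress_section are hypothetical names, and the LZMA variant shown 
is LZMA2 in an .xz container because that format has a fixed magic 
header.

```python
import gzip
import lzma

GZIP_MAGIC = b"\x1f\x8b"       # first two bytes of any gzip stream
XZ_MAGIC = b"\xfd7zXZ\x00"     # first six bytes of any .xz stream


def compress_section(data, method="gzip"):
    """Compress a package section with the requested method."""
    if method == "gzip":
        return gzip.compress(data)
    if method == "lzma":
        return lzma.compress(data, format=lzma.FORMAT_XZ)
    raise ValueError(f"unknown compression method: {method}")


def decompress_section(blob):
    """Pick the decompressor by inspecting the magic bytes."""
    if blob.startswith(GZIP_MAGIC):
        return gzip.decompress(blob)
    if blob.startswith(XZ_MAGIC):
        return lzma.decompress(blob)
    raise ValueError("unrecognised compression format")


section = b"native library bytes " * 1000
for method in ("gzip", "lzma"):
    packed = compress_section(section, method)
    print(method, len(packed), decompress_section(packed) == section)
```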

It is worth noting that archiving the native libraries within the jmod 
package removes the ability to extract one individually, but I don't 
think this should be a problem or conflict with the goal of having the 
package format streamable. We already do something similar for classes, 
which are stored as a gzipped pack200 archive. Please shout if I have 
misinterpreted this goal.
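
As a rough illustration of why a whole-archive compressed section can 
still be consumed as a stream, here is a minimal Python sketch that 
decompresses LZMA data chunk by chunk as it arrives. It uses Python's 
lzma module, not the jpkg code, and stream_decompress is a hypothetical 
helper name.

```python
import io
import lzma


def stream_decompress(source, chunk_size=4096):
    """Yield decompressed bytes as compressed chunks are read from source."""
    decompressor = lzma.LZMADecompressor()
    while not decompressor.eof:
        chunk = source.read(chunk_size)
        if not chunk:
            # Input ran out before the end-of-stream marker was seen.
            raise EOFError("compressed stream ended early")
        out = decompressor.decompress(chunk)
        if out:
            yield out


original = b"native library bytes " * 10000
packed = lzma.compress(original)

# Feed the compressed data in small pieces, as if read from a network stream.
restored = b"".join(stream_decompress(io.BytesIO(packed), chunk_size=512))
print(len(packed), len(restored), restored == original)
```

The output is produced incrementally, so the consumer never needs the 
whole compressed section in memory, although individual files inside 
the archive still cannot be reached without decompressing what 
precedes them.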

-Chris.

[1]  http://mail.openjdk.java.net/pipermail/jigsaw-dev/2011-June/001332.html
[2] http://www.7-zip.org/sdk.html


