Module-file format (DRAFT)

Mark Reinhold mr at sun.com
Fri Jan 15 14:11:06 PST 2010


Here's a first cut at a simple module-file format.

Comments welcome!

- Mark

----

Jigsaw module-file format DRAFT
===============================

Mark Reinhold <mr at sun.com>  
15 January 2010


Goals
-----

  - Optimized for streamed reading, from beginning to end.  Random access
    should not be required when reading a module file, though it may be
    required when writing one.

  - Content-specific compression: Pack200+gzip/lzma for classes, bzip2 for
    native code, etc.

  - Independent of any specific installed-module format or target-filesystem
    capability.


Layout
------

A module file has the following sections, in the order listed.

The first two sections are required:

  - Header -- Magic number, format version, sizes
  - module-info.class (never compressed, so as to be easily readable by tools)

These sections are optional:

  - Classes (excluding module-info.class)
  - Resources
  - Native libraries
  - Native launchers
  - Configuration (i.e., properties) files

The final section is required:

  - Secure hash of the entire file (except for this section)


Module file header
------------------

    ModuleFileHeader {
      u4 magic;                 // FileConstants.MAGIC
      u2 type;                  // FileConstants.Type.MODULE_FILE
      u2 major;                 // FileConstants.ModuleFile.MAJOR_VERSION
      u2 minor;                 // FileConstants.ModuleFile.MINOR_VERSION
      u4 csize;                 // Size of entire file, compressed
      u4 usize;                 // Space required for uncompressed contents
                                //   (upper bound; need not be exact)
    }

All integers are in network byte order.
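
As an illustration, a reader for this header might look like the sketch below.
It is only a sketch: the class name `ModuleFileHeaderSketch` and its `fromBytes`
helper are hypothetical, and no actual `FileConstants` values are assumed.  It
relies on `java.io.DataInputStream` reading multi-byte integers in big-endian
(network) order, and it widens each `u4` to a `long` so that values above 2^31
stay non-negative.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical holder for the 18-byte ModuleFileHeader described above.
final class ModuleFileHeaderSketch {
    final long magic;  // u4
    final int type;    // u2
    final int major;   // u2
    final int minor;   // u2
    final long csize;  // u4, size of entire file, compressed
    final long usize;  // u4, upper bound on uncompressed size

    private ModuleFileHeaderSketch(long magic, int type, int major, int minor,
                                   long csize, long usize) {
        this.magic = magic;
        this.type = type;
        this.major = major;
        this.minor = minor;
        this.csize = csize;
        this.usize = usize;
    }

    // Reads the fields in declaration order; DataInputStream is big-endian.
    static ModuleFileHeaderSketch read(DataInputStream in) throws IOException {
        long magic = in.readInt() & 0xFFFFFFFFL;   // treat u4 as unsigned
        int type = in.readUnsignedShort();
        int major = in.readUnsignedShort();
        int minor = in.readUnsignedShort();
        long csize = in.readInt() & 0xFFFFFFFFL;
        long usize = in.readInt() & 0xFFFFFFFFL;
        return new ModuleFileHeaderSketch(magic, type, major, minor, csize, usize);
    }

    // Convenience wrapper for in-memory data.
    static ModuleFileHeaderSketch fromBytes(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real reader would also compare `magic`, `type`, and the version fields against
the corresponding `FileConstants` values before going further.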


Section headers
---------------

Each section has a header:

    SectionHeader {
      u2 type;                  // One of FileConstants.ModuleFile.SectionType
      u2 compressor;            // One of FileConstants.ModuleFile.Compressor
    }

Compression is only relevant to section content; it never applies to the
headers defined here.

For sections that contain only one entity (i.e., module-info, classes, hash),
the section header is followed by a size pair, which is then followed by the
content:

    SectionSize {
      u4 csize;                 // Size of section content, compressed
      u4 usize;                 // Size of section content, uncompressed
    }
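
A single-entity section could then be consumed in one forward pass, roughly as
sketched below.  The class name `SingleEntitySectionSketch` is hypothetical, and
the sketch makes two assumptions the draft does not state: that the content
occupies exactly `csize` bytes, and that a reader is free to reject sections too
large for a Java array.  Decompression is left out, since the compressor values
are not defined here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical reader for SectionHeader + SectionSize + content.
final class SingleEntitySectionSketch {
    final int type;        // u2, a SectionType value
    final int compressor;  // u2, a Compressor value
    final long csize;      // u4, compressed size of content
    final long usize;      // u4, uncompressed size of content
    final byte[] content;  // still-compressed content bytes

    private SingleEntitySectionSketch(int type, int compressor, long csize,
                                      long usize, byte[] content) {
        this.type = type;
        this.compressor = compressor;
        this.csize = csize;
        this.usize = usize;
        this.content = content;
    }

    static SingleEntitySectionSketch read(DataInputStream in) throws IOException {
        int type = in.readUnsignedShort();
        int compressor = in.readUnsignedShort();
        long csize = in.readInt() & 0xFFFFFFFFL;
        long usize = in.readInt() & 0xFFFFFFFFL;
        if (csize > Integer.MAX_VALUE) {
            throw new IOException("section too large for this sketch: " + csize);
        }
        byte[] content = new byte[(int) csize];
        in.readFully(content);  // reads exactly csize bytes, no seeking needed
        return new SingleEntitySectionSketch(type, compressor, csize, usize, content);
    }

    // Convenience wrapper for in-memory data.
    static SingleEntitySectionSketch fromBytes(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because every size precedes the bytes it describes, the reader never needs to
seek, which matches the streamed-reading goal.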

If a section contains named files (i.e., resources, native code, config) then
each file within that section has a header, which is then followed by the
content:

    SectionFileHeader {
      u4 csize;                 // Size of file, compressed
      u4 usize;                 // Size of file, uncompressed
      u2 nameLength;            // Length of file name
      b* name;                  // File name, in UTF-8
    }
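
Reading one such per-file header might be sketched as follows.  The class name
`SectionFileHeaderSketch` is hypothetical, and the sketch assumes that
`nameLength` counts the bytes of the UTF-8 encoding rather than characters,
which the draft leaves unstated.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical reader for the SectionFileHeader described above.
final class SectionFileHeaderSketch {
    final long csize;   // u4, compressed size of the file
    final long usize;   // u4, uncompressed size of the file
    final String name;  // decoded from nameLength bytes of UTF-8

    private SectionFileHeaderSketch(long csize, long usize, String name) {
        this.csize = csize;
        this.usize = usize;
        this.name = name;
    }

    static SectionFileHeaderSketch read(DataInputStream in) throws IOException {
        long csize = in.readInt() & 0xFFFFFFFFL;
        long usize = in.readInt() & 0xFFFFFFFFL;
        int nameLength = in.readUnsignedShort();  // assumed to be a byte count
        byte[] nameBytes = new byte[nameLength];
        in.readFully(nameBytes);
        String name = new String(nameBytes, StandardCharsets.UTF_8);
        return new SectionFileHeaderSketch(csize, usize, name);
    }

    // Convenience wrapper for in-memory data.
    static SectionFileHeaderSketch fromBytes(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch decodes the name explicitly instead of calling
`DataInputStream.readUTF`, because `readUTF` expects Java's modified UTF-8
rather than standard UTF-8.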

File names may include directory separators, though separators should not be
used in the names of native-code files.

There is no mode field on files within a section.  The module-installation code
must set the local mode appropriately; e.g., it must make native-code files
executable.

A hash section has the content:

    HashSection {
      u2 type;                  // One of FileConstants.ModuleFile.HashType
      u2 size;                  // Length of hash in bytes; never compressed
      b* hash;
    }
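
Verification of the trailing hash could be sketched as below.  The class name
`HashSectionSketch` is hypothetical.  The draft does not list the `HashType`
values, so SHA-256 is used purely as a stand-in, and `size` is assumed to be the
length of `hash` in bytes.  The input to `digest` is taken to be every byte of
the file that precedes the hash section.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical parser and checker for the HashSection content.
final class HashSectionSketch {
    final int type;     // u2, a HashType value (values not defined in the draft)
    final byte[] hash;  // 'size' bytes of hash

    private HashSectionSketch(int type, byte[] hash) {
        this.type = type;
        this.hash = hash;
    }

    // ByteBuffer is big-endian by default, matching network byte order.
    static HashSectionSketch parse(ByteBuffer content) {
        int type = content.getShort() & 0xFFFF;
        int size = content.getShort() & 0xFFFF;  // assumed: hash length in bytes
        byte[] hash = new byte[size];
        content.get(hash);
        return new HashSectionSketch(type, hash);
    }

    // SHA-256 is an assumption standing in for whatever HashType selects.
    static byte[] digest(byte[] precedingBytes) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(precedingBytes);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the stored hash equals the digest of the bytes before this section.
    boolean matches(byte[] precedingBytes) {
        return MessageDigest.isEqual(hash, digest(precedingBytes));
    }
}
```

A streaming reader would more likely feed bytes to the digest as it consumes
each section, rather than buffering the whole file as this sketch does.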


Open issues
-----------

  - Signing: Should module files carry their own digital-signature information,
    as jar files do today, or should such information be provided elsewhere?


