Proposal: #FixClassFileFormat

forax at univ-mlv.fr forax at univ-mlv.fr
Fri Jul 1 20:20:25 UTC 2016


----- Original Message -----
> From: "mark reinhold" <mark.reinhold at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: jpms-spec-experts at openjdk.java.net
> Sent: Friday, July 1, 2016 19:43:15
> Subject: Re: Proposal: #FixClassFileFormat
> 
> 2016/7/1 0:39:00 -0700, Remi Forax <forax at univ-mlv.fr>:
> > mark.reinhold at oracle.com:
> >> 2016/6/29 1:51:16 -0700, Remi Forax <forax at univ-mlv.fr>:
> >>> I would agree to that proposal only on the grounds that it
> >>> requires changing the classfile format, and thus allows us at the
> >>> same time to fix the issues in the current classfile spec (encoding
> >>> of the module name as a Class instead of UTF8, wrong flag for
> >>> requires public),
> >> 
> >> I think that these are completely distinct issues (though your
> >> class-file issues are not yet in the issue list -- would you care
> >> to suggest some text?).
> > 
> > yes, sure.
> > 
> > While implementing module support in ASM, I discovered
> > several issues in the classfile format as defined by [1]:
> > 
> > - module name is not encoded as UTF8.
> > 
> >   The spec takes great care to encode references to modules, in
> >   'requires' for example, as a "UTF8" constant pool entry, a name with
> >   no other constraints, but fails to do the same for the name of the
> >   module itself.
> > 
> >   The name of the module is derived from the name of the class, which
> >   is encoded as a "Class" constant pool entry. The way to solve that
> >   is to stop deriving the name of the module from the class: the
> >   name of the class should be module-info, with no package, which
> >   seems logical given that the source file is module-info.java (not
> >   prefixed by a package), and the name of the module should be
> >   encoded at the beginning of the "Module" attribute as a reference to
> >   a constant pool entry of type UTF8. This format can also be more
> >   compact, despite adding two bytes to the "Module" attribute, in the
> >   fairly common case where an exported package has the same name as
> >   the module, because the constant pool shares the names.
> > 
> > - there is what I believe is a typo in the current spec: when
> >   specifying the modifiers of a required module (requires_flags), the
> >   modifier ACC_PUBLIC is said to have the value 0x0020 instead of
> >   0x0001 as in the JVM spec (0x0020 is either ACC_SUPER or
> >   ACC_SYNCHRONIZED).
> > 
> > - the attributes for specifying the main-class, the module-version, a
> >   concealed module, etc. are not specified, and I don't know if there
> >   is a reason for not specifying them.
> 
> On this last point: They should be specified, and will be specified in
> an upcoming revision to the spec.
> 
> Before digging into the other two issues I'd like to try just to record
> them, without any bias toward a particular solution, so I'll add the
> following two items to the issue list:
> 
>   #ClassFileModuleName --- The name of a module is not a simple UTF-8
>   string but is, rather, derived from the `this_class` field of the
>   `ClassFile` structure, which is awkward.
> 
>   #ClassFileRequires --- The `ACC_PUBLIC` constant in a `requires_flags`
>   should be encoded as `0x0001`, as it is elsewhere in the JVMS, rather
>   than as `0x0020`, which has different meanings in other contexts.
> 

yes !

And I forgot a third issue: ACC_MODULE is specified as 0x8000.
0x8000 is a very special value because it is the only one left that is available on classes, methods and fields.
Historically, in the first specs, "module" was a modifier like public or private, so using the same value for all the elements made sense,
but now ACC_MODULE can only be applied to a class, so using a value available on all the elements makes less sense; the value 0x8000 should stay free so that it can be used later.
The possible values for ACC_MODULE are: 0x0008 (ACC_STATIC), 0x0040 (ACC_VOLATILE or ACC_BRIDGE), 0x0080 (ACC_VARARGS or ACC_TRANSIENT), 0x0100 (ACC_NATIVE), 0x0800 (ACC_STRICT).
So either 0x0040 or 0x0080 is the best candidate, because the other values are only valid either on fields or on methods but not on both. And because 0x0080 is a permutation of 0x8000 in terms of characters, and thus easy to mix up with it, I think 0x0040 is the best value for ACC_MODULE.
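
The bit arithmetic behind this can be checked with a small sketch. It assumes the access_flags tables of the Java SE 8 JVMS for classes, fields and methods. The class name AccessFlagBits and its helper methods are hypothetical, made up for illustration only. It also takes the InnerClasses flags into account, which is an assumption not stated above but which would explain why 0x0002, 0x0004 and 0x0008 are poor choices for a class-level flag.

```java
public class AccessFlagBits {
    // Bits used in access_flags, assuming the Java SE 8 JVMS tables
    // (ClassFile, InnerClasses, field_info, method_info).
    static final int CLASS_USED =
        0x0001 | 0x0010 | 0x0020 | 0x0200 | 0x0400 | 0x1000 | 0x2000 | 0x4000;
    static final int INNER_CLASS_USED =
        0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010
        | 0x0200 | 0x0400 | 0x1000 | 0x2000 | 0x4000;
    static final int FIELD_USED =
        0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010
        | 0x0040 | 0x0080 | 0x1000 | 0x4000;
    static final int METHOD_USED =
        0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020
        | 0x0040 | 0x0080 | 0x0100 | 0x0400 | 0x0800 | 0x1000;

    // Bits that no class, field or method flag uses yet.
    static int freeEverywhere() {
        return ~(CLASS_USED | FIELD_USED | METHOD_USED) & 0xFFFF;
    }

    // Bits already taken on both fields and methods but still free on
    // classes (including inner classes): reusing one of them for a
    // class-only flag does not consume a bit that is free elsewhere.
    static int candidates() {
        return FIELD_USED & METHOD_USED
            & ~(CLASS_USED | INNER_CLASS_USED) & 0xFFFF;
    }

    // Renders each set bit of a 16-bit mask as a hex constant.
    static String hex(int mask) {
        StringBuilder sb = new StringBuilder();
        for (int bit = 1; bit <= 0x8000; bit <<= 1) {
            if ((mask & bit) != 0) {
                sb.append(String.format("0x%04X ", bit));
            }
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("free everywhere: " + hex(freeEverywhere()));
        System.out.println("candidates:      " + hex(candidates()));
    }
}
```

Under these assumptions the only bit free on all three kinds of element is 0x8000, and the bits that are free on classes while already used on both fields and methods are 0x0040 and 0x0080.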

Changing the value of ACC_MODULE is worthwhile if it is done at the same time as the other changes, because the compiler and the VM can then easily detect and reject classfiles encoded in what will become the old format.


> - Mark
> 

Rémi

