java.lang.constant.ClassDesc and TypeDescriptor for hidden class??

Peter Levart peter.levart at gmail.com
Sun Apr 12 10:17:38 UTC 2020


Sorry to bother again, but I just realized something...

What Mandy is proposing as 'c' is probably the best choice. Why? 
Descriptors of the form "Lpac/kage/Name.suffix;" could safely be 
rejected by a class loader, because they can't be derived from a valid 
fully qualified class name. When a fully qualified class name is 
converted to a descriptor, every '.' is converted to '/'. Therefore a 
class cannot contain '.' in its simple name. There's no public API that 
would allow resolving such a class (I think). If you call:

Class.forName("pac.kage.Name.suffix");

It is interpreted as a class with the simple name "suffix" in the 
package "pac.kage.Name". So Mandy's 'c' choice is a good choice. It has 
a bi-directional mapping:

Lpac/kage/Name.suffix; <-> pac.kage.Name/suffix

and doesn't need a sacrifice of a special character (like backslash).
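
To make the mapping above concrete, here is a minimal sketch of the swap. The class and method names (HiddenNameMapping, toDescriptor, toName) are made up for illustration and are not part of any JDK API or of Mandy's webrev; the only assumption is that '.' and '/' trade places and everything else is copied as-is.

```java
class HiddenNameMapping {
    // Swap '.' and '/' in one pass; every other character is copied unchanged.
    static String swapDotAndSlash(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            sb.append(ch == '.' ? '/' : ch == '/' ? '.' : ch);
        }
        return sb.toString();
    }

    // "pac.kage.Name/suffix" -> "Lpac/kage/Name.suffix;"
    static String toDescriptor(String name) {
        return "L" + swapDotAndSlash(name) + ";";
    }

    // "Lpac/kage/Name.suffix;" -> "pac.kage.Name/suffix"
    static String toName(String descriptor) {
        if (!descriptor.startsWith("L") || !descriptor.endsWith(";")) {
            throw new IllegalArgumentException("not a class descriptor: " + descriptor);
        }
        return swapDotAndSlash(descriptor.substring(1, descriptor.length() - 1));
    }

    public static void main(String[] args) {
        String desc = toDescriptor("pac.kage.Name/suffix");
        System.out.println(desc);          // Lpac/kage/Name.suffix;
        System.out.println(toName(desc));  // pac.kage.Name/suffix
    }
}
```

Because the swap is its own inverse, the round trip loses nothing, which is the point of not needing a third character.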

So if the class loader is made to reject names that contain '.', 
nothing is sacrificed. Such a class loader will still be able to load 
every class that exists today.
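
A rough sketch of the kind of check I have in mind, applied to the descriptor form. The helper name isResolvableClassDescriptor is hypothetical; this is not how any existing class loader validates names, just an illustration of "reject if the name inside L...; contains a '.'".

```java
class DescriptorCheck {
    // Hypothetical helper: returns true only if the descriptor has the shape
    // "L<name>;" and <name> contains no '.', i.e. it could have been derived
    // from a fully qualified class name by replacing '.' with '/'.
    static boolean isResolvableClassDescriptor(String descriptor) {
        if (descriptor.length() < 3
                || descriptor.charAt(0) != 'L'
                || !descriptor.endsWith(";")) {
            return false;
        }
        String internalName = descriptor.substring(1, descriptor.length() - 1);
        return internalName.indexOf('.') < 0;
    }

    public static void main(String[] args) {
        System.out.println(isResolvableClassDescriptor("Lpac/kage/Name;"));        // true
        System.out.println(isResolvableClassDescriptor("Lpac/kage/Name.suffix;")); // false
    }
}
```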

The only problem is that there is code out there that doesn't expect 
'/' in the FQN of a class or '.' in a descriptor and just blindly 
replaces one character with the other. But would that really present a 
problem in practice? How often would such code meet a hidden class? JDK 
internal code could be changed to swap the two characters, and 
third-party code would typically not see a hidden class.
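
For illustration, this is the sort of "blind" conversion I mean, and what it does to a hidden class name. The class and method names are invented for the example; only String.replace is real API.

```java
class NaiveConversion {
    // Typical one-way conversion found in existing code: '.' -> '/'.
    static String naiveToInternalName(String className) {
        return className.replace('.', '/');
    }

    // And the reverse: '/' -> '.'.
    static String naiveToClassName(String internalName) {
        return internalName.replace('/', '.');
    }

    public static void main(String[] args) {
        String hiddenName = "pac.kage.Name/suffix";
        String internal = naiveToInternalName(hiddenName);
        String back = naiveToClassName(internal);
        System.out.println(internal); // pac/kage/Name/suffix
        System.out.println(back);     // pac.kage.Name.suffix
    }
}
```

After the round trip the delimiter between name and suffix is gone, and the result reads as class "suffix" in package "pac.kage.Name". Code that swaps the two characters instead of replacing one with the other would not have this problem.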

Regards, Peter

On 4/12/20 11:26 AM, Peter Levart wrote:
> Just an illustration of why '\' - backslash might be a good choice... 
> I created the following experiment:
>
> package test;
> public class BackslashTest {
>     public static void main(String[] args) throws Exception {
>         Class.forName("test.Name\\suffix");
>     }
> }
>
> ...when this is run by itself, it throws ClassNotFoundException as 
> expected. Then if I do the following in the compiled classes directory:
>
> touch 'test/Name\suffix.class'
>
> ...and re-run the program, I get "ClassFormatError: Truncated class 
> file", again as expected. But I did that on my Linux box. I then 
> packed the classes directory into a .jar file:
>
> META-INF/
> META-INF/MANIFEST.MF
> test/
> test/BackslashTest.class
> test/Name\suffix.class
>
> ... and extracted that jar on a Windows PC. What I get there is the 
> following extracted file structure:
>
>  Directory of C:\Users\peter\tmp\test
>
> 04/12/2020  10:55 AM    <DIR>          .
> 04/12/2020  10:55 AM    <DIR>          ..
> 04/12/2020  10:44 AM               553 BackslashTest.class
> 04/12/2020  11:58 AM    <DIR>          Name
>                1 File(s)            553 bytes
>
>  Directory of C:\Users\peter\tmp\test\Name
>
> 04/12/2020  11:58 AM    <DIR>          .
> 04/12/2020  11:58 AM    <DIR>          ..
> 04/12/2020  10:46 AM                 0 suffix.class
>                1 File(s)              0 bytes
>
>      Total Files Listed:
>                2 File(s)            553 bytes
>                5 Dir(s)   9,659,174,912 bytes free
>
> ...if I run this with:
>
> java -cp . test.BackslashTest
>
> I again get "ClassFormatError: Truncated class file", which exposes a 
> strange implementation detail about mapping a "valid" class name to 
> the path where the .class file is found. At first this seems to be 
> just that, an implementation detail, but it means that this mapping 
> is not bijective. There could be two class names that map to the same 
> .class file. What I'm trying to say is that there are characters that 
> are valid in class names, but are unlikely to be chosen in practice by 
> a programming language because they are unfriendly to some practical 
> environment. In this respect `\` is such a character and therefore 
> could be used in the "name" of the hidden class as the delimiter 
> between the name as it appears in the .class bytes and the suffix. To 
> play it safe, the class loader should also be made to reject names 
> where '\' appears in the name of a class or package.
>
> WDYT?
>
> Regards, Peter
>
> On 4/12/20 10:28 AM, Peter Levart wrote:
>> Hi,
>>
>> Are '.' and '/' the only characters on the table? The problem with 
>> those is that both are used as hierarchical name delimiters in one 
>> way or another, and code exists that converts one to the other. If 
>> there were a third character chosen such that:
>>
>> - it would make the descriptor "invalid" for resolving
>> - otherwise not be unexpected in parsing logic
>>
>> ...then such a composition would be ideal. I think that there could 
>> even be a place for such names in the specification: syntactically 
>> valid but unresolvable names.
>>
>> The problem is that it is hard to find such a character, isn't it? 
>> But let's try...
>>
>> What about the character '\' - backslash? It seems very unlikely 
>> that this character would appear in the name of a class. In the Java 
>> language it is not allowed, and in other languages it is probably 
>> treated as some kind of delimiter and is therefore unlikely to be 
>> part of the derived class name. Would it be too much to declare that 
>> it must not appear in the name of a class?
>>
>> Even if it were not forbidden, it would be very unlikely to appear 
>> in practice. By analogy: Java anonymous classes are kind of "hidden", 
>> just on another level - in the language. They do use valid class 
>> names though, even valid in the Java language, but there's no problem 
>> with that in practice. Why? Because the outer class typically 
>> controls the "namespace" in which they appear. With Java modules, 
>> there is additional control imposed on the package namespace.
>>
>> Regards, Peter
>>
>>
>> On 4/12/20 5:35 AM, Mandy Chung wrote:
>>>
>>>
>>> On 4/8/20 3:35 PM, John Rose wrote:
>>>> On Apr 8, 2020, at 3:31 PM, John Rose <john.r.rose at oracle.com> 
>>>> wrote:
>>>>>
>>>>> In both c and c’ there will probably be a cascading failure
>>>>> if the name foo/Bar/123Z or foo/Bar is resolved.  In c’ there
>>>>> is an additional cascading failure when the user that was
>>>>> parsing the signature goes back for more parameters and
>>>>> finds a slash (instead of LZBHCIJFD[).  The thing that tipped
>>>>> me over to c’ is that extra diagnostic: Even though it happens
>>>>> after the user picked up the bad descriptor, it happens closer
>>>>> to the place where the error has its root cause, which is that
>>>>> somebody is trying to parse an (intentionally) illegal descriptor.
>>>>
>>>> P.S. Having the slash+suffix *outside* the L; envelope basically
>>>> rubs any parser’s nose in the fact that there’s something illegal
>>>> here.  Putting it inside the envelope hides the error from the
>>>> parser—which may be a good thing sometimes!  But it means
>>>> that the odd name foo/Bar.123Z will float somewhere else
>>>> and may or may not be misinterpreted.  If it’s handed to
>>>> Class.forName you can bet that the dot will change its meaning.
>>>> On balance, I slightly prefer the fail-fast properties of c’.
>>>
>>> Thanks John.
>>>
>>> I have implemented the descriptor string for a hidden class in 
>>> this form:
>>>      "L" + N + ";" + "/" + <suffix>
>>>
>>> Please see [1] for the review thread.
>>>
>>> For your reference, the webrev is:
>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.06-delta/ 
>>>
>>>
>>> Specdiff:
>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff-inc/overview-summary.html 
>>>
>>>
>>> The specs of `Lookup::defineHiddenClass`, `Class::descriptorString` and
>>> `MethodType::descriptorString` are updated to return the descriptor of
>>> this form for a hidden class.   To support hidden classes, the
>>> `java.lang.invoke.TypeDescriptor` spec is revised such that a
>>> `TypeDescriptor` object can represent an entity that may not be
>>> described in nominal form.   The serviceability APIs that return a type
>>> descriptor are updated.  This webrev includes a couple of other JVM TI
>>> and java.instrument spec clarifications w.r.t. hidden classes.
>>>
>>> Mandy
>>> [1] 
>>> https://mail.openjdk.java.net/pipermail/valhalla-dev/2020-April/007121.html
>>
>



