java.lang.constant.ClassDesc and TypeDescriptor for hidden class??

Mandy Chung mandy.chung at oracle.com
Thu Apr 16 16:33:16 UTC 2020



On 4/16/20 2:34 AM, forax at univ-mlv.fr wrote:
> Hi Mandy,
>
> We have forgotten MethodType.toMethodDescriptorString() [1], it has to 
> be updated too.
>

It's not forgotten (I only didn't list it explicitly in the summary):

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff-descriptor-string/java.base/java/lang/invoke/MethodType.html


> I still disagree with John, it's fine to use the scheme c' has a name 
> for the Hidden Class. I still think the Java methods that says in 
> their spec that they return a descriptor should return a valid 
> descriptor thus throw an exception if there is a hidden class 
> somewhere in the descriptor so a user has to decide how to handle 
> Hidden Class before calling those methods.
> Anyway, we should move on that, the patch is fine for me if the 
> javadoc of toMethodDescriptorString is updated too, so let's record 
> that i disagree and move on.
>

It was updated:
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.06-descriptor-string/src/java.base/share/classes/java/lang/invoke/MethodType.java.sdiff.html

> And for ASM, we have already talked to deprecate any methods that 
> takes a live object in Type (j.l.Class and j.l.r.Method) because it 
> causes surprising classloading loop when ASM is used in a classloader 
> or in an agent, hidden class not having a proper name is another data 
> point in that direction.
>

Thanks.
Mandy

> regards,
> Rémi
>
> [1] 
> https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/lang/invoke/MethodType.html#toMethodDescriptorString()
>
> ------------------------------------------------------------------------
>
>     *De: *"mandy chung" <mandy.chung at oracle.com>
>     *À: *"Peter Levart" <peter.levart at gmail.com>, "Remi Forax"
>     <forax at univ-mlv.fr>
>     *Cc: *"John Rose" <john.r.rose at oracle.com>, "Brian Goetz"
>     <brian.goetz at oracle.com>, "valhalla-dev"
>     <valhalla-dev at openjdk.java.net>
>     *Envoyé: *Mercredi 15 Avril 2020 20:40:48
>     *Objet: *Re: java.lang.constant.ClassDesc and TypeDescriptor for
>     hidden class??
>
>     Hi Peter, Remi,
>
>     Lois and Harold have given further feedback.   They are concerned
>     with option c's impact to JVM implementations due to this new form
>     of signature whose name is outside the "[;"] envelope  whereas
>     signatures are always inside the "[;]" envelopes since day 1.  
>     Option c'  appears to have a higher compatibility risk not only in
>     existing libraries but probably also VM implementations.
>
>     Option c has a low compatibility risk while existing code still
>     needs to be updated to support hidden classes.  As John advices
>     [1], it would be a mistake for `Class::descriptorString` to throw
>     an exception in the long run.  We have exhausted the differences
>     among these options.
>
>     Spec change proposal is:
>     - extend TypeDescriptor for entities that cannot be described
>     nominally.  If it can be described nominally,
>     TypeDescriptor::descriptorString returns a field/method descriptor
>     conforming to JVMS 4.3.  If it cannot be described nominally, the
>     result string is not a type descriptor.  No nominal descriptor can
>     be produced.
>     - specify in the javadoc for Class::descriptorString and
>     MethodType::descriptorString that the result string when it can be
>     described nominally conforming to JVMS 4.3 or when it cannot be
>     described nominally.
>     - No JVMS change
>
>     Here is the updated specdiff and webrev:
>     Specdiff:
>     http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff-descriptor-string/overview-summary.html
>
>     Webrev:
>     http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.06-descriptor-string/
>
>     Please review.
>     Thanks
>     Mandy
>     [1]
>     https://mail.openjdk.java.net/pipermail/valhalla-dev/2020-April/007093.html
>
>     On 4/14/20 2:16 AM, Peter Levart wrote:
>
>
>
>         On 4/13/20 9:09 PM, Mandy Chung wrote:
>
>
>
>             On 4/12/20 5:14 AM, Remi Forax wrote:
>
>                     The problem is not that 'c' is easier to parse,
>                     but that 'c`' is not
>                     parsable at all. Do we really want unparsable
>                     method descriptors?
>
>                     If the problem is preventing resolving of hidden
>                     class names or
>                     descriptors, then it seems that making the method
>                     descriptors unparsable
>                     is not the right place to do that.
>
>                 I agree with Peter,
>                 throwing an exception is better, there is no way to
>                 encode a hidden class in a descriptor because a hidden
>                 class has no name you can lookup,
>                 if the API return an unparsable method descriptor, the
>                 user code will throw an exception anyway.
>
>
>             Several points that are noteworthy:
>
>             1. A resolved method never has a hidden class in its
>             signature as a hidden class cannot be discovered by any
>             class loader.
>
>
>         Right.
>
>
>             2. When VM fails to resolve a symbolic reference to a
>             hidden class, it might print its name or descriptor string
>             in the error message.  Lois and Harold can confirm if this
>             should or should not cause any issue (I can't see how it
>             would cause any issue yet).
>
>             3. The only way to get a method descriptor with a hidden
>             class in it is by constructing `MethodType` with a `Class`
>             object representing a hidden class.
>
>
>         Or by custom code that manipulates class descriptors using
>         String operations. Suppose there's code that doesn't want to
>         eagerly resolve types and just manipulates Strings. Surely a
>         class descriptor of a HC can only be obtained when there *is*
>         a HC already present, but ... never underestimate programmers'
>         imagination when (s)he is combining information from various
>         sources, some of them might be resolvable types, some might be
>         just descriptors, etc...
>
>
>     True.  Our imagination is powerful!
>
>     The main thing is that a bad method descriptor will fail to resolve.
>
>
>             4. `Class::descriptorString` on a hidden class is
>             human-readable but not a valid descriptor (both option c
>             and c')
>
>             5. The special character chosen by option c and c' is an
>             illegal character for an unqualified name ("."  ";" "["
>             "/" see JVMS 4.2.2).  This way loading a class of the name
>             of a hidden class will always get CNFE via bytecode
>             linkage or Class::forName etc (either from Class::getName
>             or mapped from Class::descriptorString).
>
>
>         Right. The JVMS may remain unchanged. But that doesn't mean
>         that Class.descriptorString() couldn't be specified to return
>         a JVMS valid descriptor for classical named types, while for
>         HCs (or derived types like arrays) it would return a special
>         unresolvable descriptor with carefully specified syntax. Such
>         a syntax that would play well when composed into the syntax of
>         higher-level descriptors like method type descriptor. Why
>         would we want that? Because by that we get a more predictable
>         failure mode. We only fail when/if the type described by such
>         descriptor tries to be resolved.
>
>         In this respect, both variants 'c' and 'c`' as you said,
>         violate JVMS spec for valid class descriptor, but 'c' has a
>         more carefully chosen syntax.
>
>
>     'c' keeps the "[;" envelope which is the long-standing format.
>
>     I would say putting the name inside the "[;" envelope may not
>     break existing tools (for example if they never use the name to
>     find the class) whereas putting the suffix following ';' is harder
>     to predict how existing tools are impacted.
>
>
>             For existing tools that map a descriptor string by
>             trimming "L;" envelope and/or replacing "/" with ".",
>             "Lfoo/Foo;/123Z" (option c') may be mapped to "foo.Foo"
>             and ".123Z" (if used ";" as a separator)...
>
>
>         I would say that for existing tools that treat a single class
>         descriptor at once, with option 'c`' they won't treat ';' as a
>         separator between multiple elements. I would say that existing
>         code that tries to trim 'L;' would either:
>
>         - remove the 'L' prefix and strip the string of ';' character
>         wherever it is, which would produce "foo/Foo/123Z" and
>         consequently "foo.Foo.123Z" (a valid binary name)
>         - grok and fail (for example because something that starts
>         with 'L' does not end with ';')
>         - if the code is "hackish" it might blindly trim the last
>         character if the 1st is '[', so we would end up with
>         "foo/Foo;/123" and consequently with "foo.Foo;.123" (not a
>         valid binary name)
>         - something else that neither of us can imagine now
>
>             or "foo.Foo/123Z" which are invalid name whereas
>             "Lfoo/Foo.123Z;" (option c) may have higher chance be
>             mapped to "foo.Foo.123Z" which is a valid binary name.
>
>
>         Right, but neither is 'c`' immune to that interpretation. At
>         least the failure mode of 'c' is more predictable.
>
>
>             ";" and "[" are already used for descriptor.  The
>             remaining ones are "." and "/".
>
>             JDWP and JDI are examples of existing tools that obtain
>             the type descriptor by calling JVM TI `GetClassSignature`
>             and then trims the "L;" envelope and replace "/" with
>             ".".    Option c produces "foo.Foo.123Z" as the resulting
>             string which might make it harder
>
>
>         And what does option 'c`' produce?
>
>         Existing JDK tools could be updated from day 0. Existing 3rd
>         party tools would have to be updated too in either case.
>         Typical failure mode for option 'c' would be that class
>         "foo.Foo.123Z" can't be found. Who knows what kind of failure
>         modes would option 'c`' produce if parsing was done in C for
>         example. Are crashes excluded?
>
>
>             6. Throwing an exception (option a) may make existing
>             libraries to catch issues very early on.  I see the
>             consistency that John made about dual-use APIs that prints
>             a human-readable but not resolvable descriptor. I got
>             convinced that option c and c' have the benefit over
>             option a.
>
>             7. Existing tools or scripts that parse the descriptor
>             string have to be updated in both option c and c' to
>             properly handle hidden classes.  Option c may just hide
>             the problem which is bad if it's left unnoticed but
>             happens in customer environments.
>
>
>         I doubt that the problem would be hidden with option 'c'.
>         Either the code would just work (because it needs not resolve
>         the descriptor of HC) or it would grok on trying to resolve
>         it. In theory the binary name "foo.Foo.123Z" could be resolved
>         into a real class, but that's hardly possible in practice
>         unless you specifically construct such case. And option 'c`'
>         is not immune to that as well. So I don't think that we would
>         suddenly see a bunch of wrong resolvings where "foo.Foo.123Z"
>         would actually be resolved successfully. You have a say in how
>         the suffix in "foo.Foo/suffix" is constructed and by using
>         something that is not a usual name the chances can be minimized.
>
>
>             My only concern is the compatibility risk on existing
>             agents that assume JVM TI `GetClassSignature` returns a
>             valid type descriptor and use it to resolution.   Both
>             option c and c' return an invalid descriptor string and so
>             I consider the impact is about the same.    JDI and JDWP
>             have to be updated to work with either new form. As John
>             noted, option c' has the fail-fast properties that may
>             help existing code to diagnose issues during migration.
>
>             That's my summary why I went with option c'.    The
>             preference is "slightly".
>
>             Any other thought?
>
>
>         I think that it is easier to debug a more predictable failure
>         even if it happens a little later (when resolving the
>         descriptor) than it is to debug an unpredictable (unimagined)
>         failure which supposedly happens a little earlier. In that
>         respect, option 'a' is most predictable, but it might be "to
>         early" (for example, what if some code just wants to log the
>         descriptor). And 'c`' seems a little scary to me, because I
>         can't imagine all the possible failures.
>
>         Regards, Peter
>
>



More information about the valhalla-dev mailing list