java.lang.constant.ClassDesc and TypeDescriptor for hidden class??

Mandy Chung mandy.chung at oracle.com
Wed Apr 8 21:46:41 UTC 2020



On 4/6/20 5:21 PM, John Rose wrote:
> On Apr 2, 2020, at 10:03 PM, John Rose <john.r.rose at oracle.com> wrote:
>> But, I think the best compromise is to admit that, just
>> as Class::getName returns an intentionally invalid
>> though suggestive class name (rather than null or an
>> exception), so Class::descriptorString should return
>> the intentionally invalid though suggestive field
>> descriptor which is regularly derived from said
>> Class::getName.  This provides consistency in
>> string-valued outputs; in the absence of an argument
>> why their treatment should be inconsistent,
>> consistency is less confusing and should win.
> My previous Email was confusing because it gave
> two conflicting answers both incomplete.  Let me
> start again.  (Thanks Mandy for querying me on
> Slack.)
>
> First of all, it should *never* be the case that a
> `ClassDesc` should represent a symbol which is
> not well formed at the descriptor level.  Since
> there is no well formed descriptor for a HC,
> it follows that `Class::describeConstable`
> should never return a non-empty value.
>
> Second of all, associated with the constable
> interface for `Class`, but independent, are three
> API points that produce useful reflective
> representations of type `String`.  There
> is the old `getName` and `getSimpleName`,
> and the new `descriptorString`.
>
> In order to allow HCs to be printed they must
> be named.  Thus, HCs must produce names.
> (And simple-names, for places like the `toString`
> method of `MethodType`.)  These names are
> dual-use, for human reading and for resolution
> by recycling them through `Class::forName` and
> related API points.  The first use does not require
> well-formed class names, and for HC’s (for the first
> time) we *forbid* `getName` to produce a well-formed
> class name.  This ensures that a HC’s name can be
> printed for humans to read, but cannot accidentally
> be resolved.
>
> I suggest that `descriptorString` be *also* regarded
> as a dual-use API point.  This means that (a) a HC’s
> descriptor string should be unsurprising to a human
> reader, and (b) it must be unacceptable as an input
> for a resolving API point.  A corollary of (b) and of
> my earlier point is that a HC’s descriptor string *also*
> must be unacceptable to the factory methods for
> `ClassDesc` and `MethodTypeDesc`.
>
> Let me suggest a specific way to do this.  A HC’s name
> is of the form `N + S`, where N is a valid class name,
> derived from the HC’s classfile, and S is a suffix added
> by the platform.  To prevent confusion with other names,
> the suffix starts with slash ‘/’ and is otherwise a valid
> unqualified name.  In order to create a dual-use
> (human readable but *not* resolvable) descriptor
> for a HC, define it as the valid descriptor for a class
> whose name is `N`, with a suffix `S`.  I think this
> meets all relevant use cases and requirements.
>
> For example, a HC with original name `foo.Bar` and
> suffix `/123` would have a descriptor string of
> `Lfoo/Bar;/123`.  This is simple and unsurprising.

Let's call this option c', as it's a modification of option c.  I'll
use `123Z` as the example suffix to show why the position of ';'
matters.

Option c:  `Lfoo/Bar.123Z;`
Option c': `Lfoo/Bar;/123Z`

If someone creates a MethodType whose parameter types are a hidden class,
boolean and int, descriptorString() produces (option c' first, then
option c):
     `(Lfoo/Bar;/123ZZI)V`
vs
     `(Lfoo/Bar.123Z;ZI)V`
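To make the two shapes concrete, here is a minimal sketch that builds
both forms from N and S by plain string concatenation.  It is not the
webrev code; the class and method names (`HiddenDescriptorForms`,
`optionC`, `optionCPrime`, `methodDescriptor`) are made up for
illustration only.

```java
import java.util.List;

// Illustrative only: string-level model of the two proposals.
final class HiddenDescriptorForms {

    // Option c: the suffix S sits inside the "L...;" envelope, joined by '.'.
    static String optionC(String binaryName, String suffix) {
        return "L" + binaryName.replace('.', '/') + "." + suffix + ";";
    }

    // Option c': the valid descriptor for N, followed by "/" + S.
    static String optionCPrime(String binaryName, String suffix) {
        return "L" + binaryName.replace('.', '/') + ";/" + suffix;
    }

    // Concatenates parameter descriptors and a return descriptor.
    static String methodDescriptor(List<String> params, String ret) {
        return "(" + String.join("", params) + ")" + ret;
    }

    public static void main(String[] args) {
        String c = optionC("foo.Bar", "123Z");
        String cPrime = optionCPrime("foo.Bar", "123Z");
        // Parameters: hidden class, boolean, int; return type void.
        System.out.println(methodDescriptor(List.of(c, "Z", "I"), "V"));
        System.out.println(methodDescriptor(List.of(cPrime, "Z", "I"), "V"));
    }
}
```

With option c' the trailing `Z` of the suffix runs straight into the
`Z` of the boolean parameter, which is why `123Z` was picked as the
example suffix.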

In option c, ';' still terminates each reference type, so the parameter
descriptors can be split properly, which may benefit error reporting.
An error message for option c', by contrast, would have to include the
entire string.

Option c' has the nice property that it retains the type descriptor of
the original bytes and merely appends a suffix.  On the other hand,
option c produces a human-readable string that can be parsed easily,
e.g. to tell how many parameter types the descriptor string contains.
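The parsing difference can be sketched with a deliberately naive
splitter that, like many existing tools, assumes ';' ends a reference
type.  This is an illustration under that assumption, not JDK code;
`NaiveDescriptorSplitter` and `splitParameters` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: splits the parameter list of a method descriptor,
// assuming every reference type ends at the next ';'.
final class NaiveDescriptorSplitter {

    static List<String> splitParameters(String desc) {
        int close = desc.indexOf(')');
        if (!desc.startsWith("(") || close < 0) {
            throw new IllegalArgumentException("not a method descriptor: " + desc);
        }
        List<String> params = new ArrayList<>();
        int i = 1;
        while (i < close) {
            int start = i;
            while (i < close && desc.charAt(i) == '[') {
                i++; // skip array dimensions
            }
            if (i >= close) {
                throw new IllegalArgumentException("dangling '[' in " + desc);
            }
            char c = desc.charAt(i);
            if (c == 'L') {
                int semi = desc.indexOf(';', i);
                if (semi < 0 || semi > close) {
                    throw new IllegalArgumentException("unterminated reference type in " + desc);
                }
                i = semi + 1;
            } else if ("BCDFIJSZ".indexOf(c) >= 0) {
                i++; // primitive type
            } else {
                // The splitter cannot recover, so it reports the whole string.
                throw new IllegalArgumentException(
                    "unexpected '" + c + "' at index " + i + " in " + desc);
            }
            params.add(desc.substring(start, i));
        }
        return params;
    }

    public static void main(String[] args) {
        // Option c: three parameters are recovered.
        System.out.println(splitParameters("(Lfoo/Bar.123Z;ZI)V"));
        // Option c': the splitter stops at the '/' that follows ';'.
        try {
            System.out.println(splitParameters("(Lfoo/Bar;/123ZZI)V"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Such a splitter still counts three parameters for option c even though
the first one is not a valid descriptor, whereas for option c' it can
only give up and report the full string.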

I'm uncertain about the migration impact on any tool that assumes ';'
is the last character of a reference type descriptor (JVM TI agents may
be affected because JVM TI GetClassSignature returns a type descriptor).

What are other benefits of c' over c?

I'm currently leaning toward option c because the resulting string for a
method type is easier to parse and there is one fewer form of name (i.e.
the type descriptor is "L" + CN + ";", where CN is N + "." + S and N is
the `this_class` name of the `newBytes` from which the hidden class is
derived).

> Yes, it would be even simpler to throw an exception from
> `Class::descriptorString` if the class is a HC.  But I think
> that’s *too simple*, because it makes `descriptorString`
> useless as an input to any class’s `toString` method.
> I think that would be a mistake, in the long run.  String
> producing methods are very useful for user output and
> having them throw (or return null which is about as
> surprising) is a sharp edge for anybody using them for
> user output.  I could be wrong about that, and if the rest
> of y’all are sure I’m wrong about that, go ahead and
> throw an exception.  I’ll reserve the right to say “I told
> you so” when the appropriate time comes.
>
> — John

OK.  I took a first pass at the spec.  With it,
`java.lang.invoke.TypeDescriptor::descriptorString` may produce an
invalid descriptor string.  This version implements option c, but it's
very easy to switch to option c' (a 2-line change).

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/8242013/specdiff/overview-summary.html

webrev:
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/8242013/webrev.01/

Class::getName             `foo.Bar/123Z`

Option c:
   Class::descriptorString       `Lfoo/Bar.123Z;`
   MethodType::descriptorString  `(Lfoo/Bar.123Z;ZI)V`

If we choose option c', then:
   Class::descriptorString       `Lfoo/Bar;/123Z`
   MethodType::descriptorString  `(Lfoo/Bar;/123ZZI)V`
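
As a quick check of John's requirement (b), the sketch below feeds the
candidate strings listed above to `ClassDesc.ofDescriptor` and records
whether the factory accepts them.  It assumes a JDK that ships
`java.lang.constant` (12 or later); `RejectedByClassDesc` and
`isAccepted` are hypothetical names.  Both invalid forms should be
refused with an `IllegalArgumentException`.

```java
import java.lang.constant.ClassDesc;

// Illustrative only: checks which strings the ClassDesc factory accepts.
final class RejectedByClassDesc {

    static boolean isAccepted(String descriptor) {
        try {
            ClassDesc.ofDescriptor(descriptor);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // not a well-formed field descriptor
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "Lfoo/Bar;",       // ordinary class, for comparison
            "Lfoo/Bar.123Z;",  // option c
            "Lfoo/Bar;/123Z"   // option c'
        };
        for (String d : candidates) {
            System.out.println(d + " accepted: " + isAccepted(d));
        }
    }
}
```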


Mandy


