Bug in attribute writing

Mon May 22 16:43:03 UTC 2023

On Mon, May 22, 2023 at 9:32 AM Brian Goetz <brian.goetz at oracle.com> wrote:
>
>
>
> On 5/22/2023 10:09 AM, - wrote:
> > For writing, I don't think unknown attributes are ever created when
> > they are presented in the first place; so we can assume "support
> > unknown attributes" is always true.
>
> Not quite sure what you're saying here?
>

I mean, the check for whether unknown attributes are enabled is
redundant, as unknown attributes won't be created if it's not enabled,
and if it's provided to the wrong class builder, the current check
always rejects it, too.

>
> Sigh, type annotations are the bane of my existence.  There are really
> no good answers here.
>
> Most attributes contain indexes into the CP, but express little other
> dependency on the structure of the classfile.  As a result, we can use
> "is there a shared CP" as a proxy for "is it OK to write an attribute
> from one classfile to another".  This is true for all the standard
> attributes except the type annotation ones, which contain indexes into
> the code array, exception table, signature attribute ("mth bound of nth
> type parameter"), etc.  Tracking whether a transform would undermine
> these indexes is probably nearly impossible, and even if it were, the
> cost would be orders of magnitude out of line.  So we have a few bad
> choices:
>
>   - silently drop TA attributes on the floor when adapting;
>   - faithfully write presented TA attributes even though they might be
> invalid;
>   - insist the user unpack and repack the attribute.
>
> The "unpack" option is just make-work that won't do much to improve the
> accuracy of the result; much of what's in the TA attributes is of
> interest to static compilers, not bytecode processors.  Silently
> dropping the attribute seems rude.  So the least-bad alternative is to
> just believe the user that this attribute is valid enough for use at
> runtime.
>
> I'd like a fourth option, which maybe involves some sort of classfile
> processing option to give us better guidance, but so far we are not sure
> what is best here.

A fourth option already exists in the Classfile API: we already
process some attributes closely tied to others as an integral unit
instead of separately, such as Code-StackMapTable, constant
pool-BootstrapMethods. Maybe we can try singling out each element of a
type annotation structure and bind it to its respective parent
(exception catch, local variable, signature etc.) during processing.
If a piece of information the annotation is trying to apply to does
not exist, we can simply leave it and not transform or drop it
(configured by classfile options), since it wouldn't be useful in the
beginning.

>
> > I think we can alleviate such dependencies by exposing pointers in the
> > model: for instance, we will use Label instead of int for startPc for
> > LocalVariableTable; the catch table index in type annotation targets
> > can refer to ExceptionCatch. Then, in the writeDirect checks, we
> > propagate checks to the components: if the dependencies of the Labels
> > or ExceptionCatchs are the same (much like how a constant pool is
> > still the same), we will perform direct write; otherwise we perform a
> > clean write.
>
> There are two kinds of references that are virtualized by the library:
> CP indexes and offsets (labels).  The underlying goal is that a
> classfile element from one classfile can be correctly interpreted in
> another.  For CP indexes, if the two classfile share a CP, then CP
> indexes are already good; if not, they contain enough information to be
> copied at write time.  For labels, labels have a meaning specific to the
> containing Code attribute / CodeBuilder, so taht the same label can be
> correctly interpreted in both contexts. We do not virtualize anything else.
>
> We could try to do the same for the exception table; this is worth
> exploring, since we've definitely encountered bugs related to the table
> being reordered before.  But for TA attributes, this eventually runs
> into the absurd, because it contains all sorts of exotic pointers like
> "mth bound of nth type parameter".
>
> Further, virtualization of more quantities starts to exhibit poor
> returns, because we want to be able to minimize the number of
> "unpack/repack" operations when doing a trivial transform.  We can't
> help but do so for labels, but if we had to do so for everything else,
> the costs would likely spiral out of control.

Binding type annotation entries to the application sites would be
another type of virtualization, which I think is more acceptable.
>
> Zooming out, there's a real lesson here: the type annotation attributes
> violated an implicit design principle for attributes, and now everyone
> downstream is paying the cost.  But I am diffident about imposing more
> of these costs on every user of every bytecode-processing framework for
> the sake of incrementally better TA fidelity.
>
> Where we are now is a compromise; do you see a better compromise?

Thinking about custom attributes, I think such a dependency (linking
with indices) would, in fact, be fairly common:
1. Class-files are constant, indice links are reliable;
2. Attribute formats are predetermined, changing preexisting
attributes' format is almost impossible.

We thus might need some sort of attribute-linking features, where one
attribute's updates can be propagated to a downstream attribute. I
would envision that to be defined by the attribute mapper, which
probably has already expanded to become a handler by that point. But
this whole thing looks like an overkill; I am looking for a more
simple solution as well.

> >
> > Chen Liang
> >
> > On Thu, May 18, 2023 at 9:34 AM Brian Goetz <brian.goetz at oracle.com> wrote:
> >> I just noticed that in BoundAttribute.BoundUnknownAttribute:
> >>
> >>          private void checkWriteSupported(Function<ConstantPool, Boolean> condition) {
> >>              if (!condition.apply(classReader))
> >>                  throw new UnsupportedOperationException("Write of unknown attribute " + attributeName() + " not supported to alien constant pool");
> >>          }
> >>
> >> we should at least be checking the option "support unknown attributes" before we say OK.
> >>
> >> Overall, I am not sure we have a fully consistent story for unsupported attributes here.  The basic problem with attributes for which we don't have an AttributeMapper is that we can't be sure about validity.  We don't know where the CP indexes are (though if we are sharing CPs, this is OK), and we don't know what other data might be off.
> >>
> >> (We also have a bigger problem with the type annotations attributes, which are known to have all sorts of non-CP offsets (nth bound of mth type variable, nth exception, indexes into code array, etc) which could easily but thrown off by transformation and there is no practical way to detect whether the original RVTA is still valid.)
>