<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<font size="4"><font face="monospace">Some thoughts on refining the
AttributeStability enum. <br>
<br>
Since we don't want to have to support combinations (labels and
CP refs), this enum should be linearly ordered, where STATELESS
< CP < LABELS < HAZMAT < UNKNOWN. I believe the
semantics so far follow this, but we should verify and the docs
should reflect this so people don't ask "what if it has both X
and Y". <br>
<br>
The names of the enum constants need some work :) <br>
<br>
I am wondering about what is the practical difference between
HAZMAT and UNKNOWN. Since you have to ask the attribute mapper
for the stability level, and "unknown attribute" has
historically been "no attribute mapper available", is this
difference worth reflecting? If so, what does it mean? Are we
looking to inflate a synthetic attribute mapper for each
observed unknown attribute, whose functionality is partial? <br>
<br>
Similarly, the AttributesProcessingOption should reflect a
linear progression of pickiness about attributes: <br>
<br>
- pass all attributes<br>
- block unknown, but pass hazmat<br>
- block unknown and hazmat<br>
<br>
I think this is a matter of documenting that this sequence is
monotonic. <br>
<br>
A related question is where we want to do the dropping. We
could drop attributes either on the read side (never even
present it to the user) or the write side. Currently we drop on
read, but I think this may not be ideal. <br>
<br>
Since readers should always be prepared to deal with attributes
they don't recognize, dropping unknown attributes on reading
seems premature. THe only reason to drop an attribute
preemptively is if processing it will trigger unnecessary work.
For attributes on classes, fields, and methods, we do very
little work already; we record the offset, name, and length of
the attribute, and wrap this with an object that only looks at
the bytes if you ask more questions of it. (Given the structure
of the classfile format, its hard to imagine doing less work,
though I think we could be even lazier about inflating a String
for the UTF8 of the name.) <br>
<br>
Where we do a lot of work for attributes is on the attributes in
the Code attribute -- line number tables, local variable tables,
etc. But we have an option for suppressing these separately. <br>
<br>
For HAZMAT attributes such as RVTA, the validity is subtle --
the RVTA on a method can be invalidated if anything in the
method changes (including other method attributes). But if we
don't even explode the method (because the user just passed it
through as a ClassElement), we wouldn't want to explode it just
to remove the RVTA. Which means that the
AttributeProcessingOption is not a guarantee that such
attributes would be dropped, just permission to do so if the
going gets tough. <br>
<br>
So I think what we need to discuss a little further is:<br>
- Better names in the AttributeStability enum <br>
- When we should drop attributes according to the
AttributeProcessingOption<br>
<br>
</font></font><br>
<div class="moz-cite-prefix">On 8/1/2023 5:07 AM, Adam Sotona wrote:<br>
</div>
<blockquote type="cite" cite="mid:CY4PR1001MB2150EFA1A72349311739BFD78C0AA@CY4PR1001MB2150.namprd10.prod.outlook.com">
<meta name="Generator" content="Microsoft Word 15 (filtered
medium)">
<style>@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:10.0pt;
font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
font-size:10.0pt;
font-family:"Calibri",sans-serif;}span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}div.WordSection1
{page:WordSection1;}ol
{margin-bottom:0cm;}ul
{margin-bottom:0cm;}</style>
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">FYI: I’ve created
<a href="https://bugs.openjdk.org/browse/JDK-8313452" moz-do-not-send="true">JDK-8313452</a> and draft <a href="https://github.com/openjdk/jdk/pull/15101" moz-do-not-send="true">
PR 15101</a> “</span><span style="font-size:11.0pt;mso-fareast-language:EN-US">Improve
Classfile API attributes handling safety</span><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">”<b><o:p></o:p></b></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">(the PR is targeted after
<a href="https://bugs.openjdk.org/browse/JDK-8312491" moz-do-not-send="true">JDK-8312491</a> / <a href="https://git.openjdk.org/jdk/pull/14968" moz-do-not-send="true">
PR 14968</a> is integrated due to Javadoc update and
conflicts).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">The custom attribute mapper simplification has
lower priority and can be discussed independently.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">Thanks,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">Adam<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<div id="mail-editor-reference-message-container">
<div>
<div style="border:none;border-top:solid #B5C4DF
1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:
</span></b><span style="font-size:12.0pt;color:black">Brian
Goetz <a class="moz-txt-link-rfc2396E" href="mailto:brian.goetz@oracle.com"><brian.goetz@oracle.com></a><br>
<b>Date: </b>Monday, 31 July 2023 16:00<br>
<b>To: </b>Adam Sotona
<a class="moz-txt-link-rfc2396E" href="mailto:adam.sotona@oracle.com"><adam.sotona@oracle.com></a>,
<a class="moz-txt-link-abbreviated" href="mailto:classfile-api-dev@openjdk.org">classfile-api-dev@openjdk.org</a>
<a class="moz-txt-link-rfc2396E" href="mailto:classfile-api-dev@openjdk.org"><classfile-api-dev@openjdk.org></a><br>
<b>Subject: </b>Re: Attribute safety<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><span style="font-size:11.0pt"><br>
<br>
<o:p></o:p></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">I like the idea. It makes sense to
simplify handling of custom attributes for some common
situations.</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">As the proposal adds a method to
AtributeMapper identifying “brittle” attributes, it
still implies existence of custom attribute mapper for
each custom attribute.</span><o:p></o:p></p>
</blockquote>
<p class="MsoNormal"><span style="font-size:11.0pt"><br>
Right now, there are two choices for modeling
attributes:<br>
<br>
- No attribute mapper. Here, we will treat it as an
unknown attribute, and use the option for unknown
attribute handling to determine whether to preserve or
drop the attribute.
<br>
<br>
- Attribute mapper present. Here, we currently assume
that if there is an attribute mapper, we can pass the
attribute through uninterpreted during transformation if
the constant pool is shared, and we lift the attribute
to the object form and re-render to bytes it if the
constant pool is not shared. <br>
<br>
We've tried to make it easy to write attribute mappers,
to encourage people to do so. The implicit assumption
in the attribute mapper design currently is that the
only thing that might be environmentally sensitive is
the constant pool. I think this is the assumption we
want to refine. (Secondarily, the explode-and-rewrite
trick can also tolerate labels moving, because labels
are handled through a level of indirection.)<br>
<br>
Thinking some more about how to model this, a single bit
is not good enough. So I propose:<br>
<br>
enum AttributeStability { STATELESS, CP_REFS,
LABELS, HAZMAT }<br>
<br>
(the names here are bad.)<br>
<br>
Where:<br>
<br>
- STATELESS means the attribute contains only pure
data, such as timestamps, and can always be
bulk-copied.
<br>
- CP_REFS means that the attribute contains only pure
data and CP refs, so can be bulk-copied when CP sharing
is in effect, and exploded/rewritten when CP sharing is
not in effect<br>
- LABELS means that the attribute may contain labels,
so should always be exploded/rewritten<br>
- HAZMAT means the attribute may contain indexes into
structured not managed by the library (type variable
lists, etc) and so we consult the "toxic attributes"
option to determine whether to preserve or drop it<br>
<br>
Most JVMS attributes are CP_REF. Some like Deprecated
and CompilationID are STATELESS. The TA attributes are
HAZMAT. The local variable table attributes are
LABELS.
<br>
<br>
So the new API surface is:<br>
<br>
- an enum for the attribute's environmental coupling<br>
- an accessor on AttributeMapper for that enum<br>
- an option for what to do with HAZMAT attributes
(which should probably be merged with the option for
UKNOWN attributes)<br>
<br>
If stateless attributes were common, we might try to
make life easier for attribute mapper writers by making
the read/write methods optional for such attributes, but
they are pretty uncommon so I think this is not worth
it.<br>
<br>
<br>
<br>
<br>
<br>
<br>
<o:p></o:p></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">Current attributes can be split into
following categories :</span><o:p></o:p></p>
<ol style="margin-top:0cm" type="1" start="1">
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l2 level1 lfo3"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">Self-contained attributes (no
dependency on CP nor Code offsets). Such attributes
can be safely transformed in any situation and their
payload is just copy/pasted.</span><o:p></o:p></li>
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l2 level1 lfo3"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">Attributes with references to constant
pool. Such attributes can be safely transformed when
the CP is shared, however require custom handling
(cloning of CP entries) during write into a class
with new CP.</span><o:p></o:p></li>
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l2 level1 lfo3"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">Attributes with references to bytecode
offsets (Code attributes). Payload of such
attributes can be safely copy/pasted only when the
Code is untouched. Otherwise they require custom
translation into labeled model during read and back
to offsets during write. These attribute most
probably also use constant pool.</span><o:p></o:p></li>
</ol>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">I would suggest an alternative proposal
to provide various custom attribute mapper factories,
mainly to simplify handling of category #1 and #2 of
custom attributes.</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">That solution would not require to add
any indication methods to the mappers nor global
switches. Each custom mapper (composed by user) will
respond to the actual situation accordingly.</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">For category #1 there might be a single
factory getting attribute name and returning attribute
mapper.</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">For category #2 there might be more
options:</span><o:p></o:p></p>
<ol style="margin-top:0cm" type="1" start="1">
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l1 level1 lfo6"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">A factory producing mapper which throws
on write when CP is not shared</span><o:p></o:p></li>
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l1 level1 lfo6"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">Or a factory producing mapper
simplifying CP entries clone and re-mapping on write
when CP is not shared (it might be implemented even
the way the user function identify offsets of CP
indexes inside the payload and mapper does all the
job with CP entries re-mapping).
</span><o:p></o:p></li>
</ol>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">For category #3 we may also provide some
mapper factories, as we will better know specific use
cases.</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">Thanks,
</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US">Adam</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;mso-fareast-language:EN-US" lang="EN-US"> </span><o:p></o:p></p>
<div id="mail-editor-reference-message-container">
<div>
<div style="border:none;border-top:solid #B5C4DF
1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:
</span></b><span style="font-size:12.0pt;color:black">classfile-api-dev
<a href="mailto:classfile-api-dev-retn@openjdk.org" moz-do-not-send="true">
<classfile-api-dev-retn@openjdk.org></a>
on behalf of Brian Goetz <a href="mailto:brian.goetz@oracle.com" moz-do-not-send="true">
<brian.goetz@oracle.com></a><br>
<b>Date: </b>Thursday, 27 July 2023 23:02<br>
<b>To: </b><a href="mailto:classfile-api-dev@openjdk.org" moz-do-not-send="true" class="moz-txt-link-freetext">classfile-api-dev@openjdk.org</a>
<a href="mailto:classfile-api-dev@openjdk.org" moz-do-not-send="true"><classfile-api-dev@openjdk.org></a><br>
<b>Subject: </b>Attribute safety</span><o:p></o:p></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:13.5pt;font-family:"Courier
New",serif">We currently divide attributes
into two buckets: those for which an attribute
mapper exists, and those for which one doesn't.
The latter are represented with
`UnknownAttribute`. There is also an Option to
determine whether unknown attributes should be
discarded when reading or writing a classfile.
The main reason to be cautious about unknown
attributes is that we cannot guarantee their
integrity during transformation if there are any
other changes to the classfile, because we don't
know what their raw contents represent.
<br>
<br>
The library leans heavily on constant pool sharing
to optimize transformation. The default behavior
when transforming a classfile is to keep the
original constant pool as the initial part of the
new constant pool. If constant pool sharing is
enabled in this way, attributes that contain only
pure data and/or constant pool offsets can be
bulk-copied during transformation rather than
parsing and regenerating them.
<br>
<br>
Most of the known attributes meet this criteria --
that they contain only pure data and/or constant
pool offsets. However, there are a cluster of
attributes that are more problematic: the type
annotation attributes. These may contain offsets
into the bytecode table, exception table, list of
type variables, bounds of type variables, and many
other structures that may be perturbed during
transformation. This leaves us with some bad
choices:<br>
<br>
- Try to track if anything the attribute indexes
into has been changed. (The cost and benefit here
are out of balance by multiple orders of magnitude
here.)<br>
- Copy the attribute and hope it is good enough.
Much of the fine structure of RVTA and friends are
not actually used at runtime, so this may be OK.
<br>
- Drop the attribute during transformation and
hope that's OK. <br>
<br>
(There are also middle grounds, such as trying to
detect whether the entity with the attribute
(method, field, etc) has been modified. This is
lighter-weight that trying to track if the
attribute has been invalidated, but this is
already a significant task.)
<br>
<br>
I haven't been happy with any of the options, but
I have a proposal for incrementally improving it:<br>
<br>
- Add a method to AttributeMapper for to indicate
whether or not the attribute contains only pure
data and/or constant pool offsets. (Almost all
the attributes defined in JVMS meet this
restriction; only the type annotation attributes
do not.) For purposes of this mail, call the ones
that do not the "brittle" attributes. <br>
<br>
- Add an option to determine what to do with
brittle attributes under transformation: drop
them, retain them, fail.
<br>
<br>
This way, nonstandard brittle attributes can be
marked as such as well, and get the same treatment
as the known brittle attributes.
<br>
<br>
<br>
</span><o:p></o:p></p>
</div>
</div>
</blockquote>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>