Experience report
Brian Goetz
brian.goetz at oracle.com
Thu Aug 4 21:08:19 UTC 2022
Thanks for the experience report! Some comments inline.
On 8/4/2022 4:19 PM, Dan Heidinga wrote:
> Congrats to all involved in building this api! It's been easy to pick
> up and get started with.
Just as streams was the first API we "designed for lambdas", this is the
first API we "designed for pattern matching."
> The examples in the package-summary give a good starting point, though
> I stumbled slightly when switching from purely reading to actually
> transforming classfiles. I hit the "Writing classes" examples first
> and rat-holed a bit there before realizing that I only needed the
> ClassModel::transform call. It might be worth re-ordering the flow to
> put transformation first as in my experience that's more common than
> writing new classfiles.
I think we also need a lot more examples about transformation. The
transformation API may be simple, but there is some subtlety to it and
it needs a lot more examples.
> One of the areas I was modifying was the NEST_HOST and NEST_MEMBERS
> attributes. Adapting an attribute, like adding new members to a
> NEST_MEMBERS attribute, feels unpolished as it requires removing the
> existing attribute and then adding an entirely new one. It would feel
> more straightforward if there was a way to generate a new attribute
> from the old with additions. Some kind of ::withSymbols method maybe?
This use case (and similar: adding an annotation, adding an interface,
etc) was something we struggled with for a long time. I rewrote the
transformation API several times, and each time this case was on my
mind. I am hoping that what is missing here is just the example in the
documentation, rather than an entire new mechanism.
The basic problem is that for any of the attributes that carry "list of
stuff", and you want to add an item, you don't know until the end of the
iteration whether you need to create a new attribute with one thing or
transform the existing attribute. There are a few ways to do it, some
require more hoop jumping than others.
Note that the Transform interfaces, while SAM types, have some extra
methods -- atStart(b) and atEnd(b). These are default methods whose
default implementation does nothing, so that you can have a transform
lambda, but if you want to use either of these methods in your
transform, you need a class (anonymous or otherwise.)
You also need to manage at least one bit of state: is there an element
that carries the list of stuff in the stream, or not? And there are two
ways to manage this: either use the existing model to store your state,
or track the state in your transform object.
The first way is simpler in that you can get someone else (the model) to
do your state management, but feels a little distasteful. What you do is:
- have the accept method always ignore the list-carrier element (e.g.,
NestMembers)
- have the atEnd method query builder().original() to see if the
list-carrier element is present (e.g., findAttribute()); if it is
present, get it and transform it and send it downstream, and if is not
present, generate a new one.
new ClassTransform() {
void accept(ClassBuilder b, ClassElement e) {
switch (e) {
case NestMembersAttribute a -> break;
... other transform stuff ...
}
}
void atEnd(ClassBuilder b) {
Optional<NestMembersAttribute> a =
b.original().get().findAttribute(Attributes.NEST_MEMBERS);
if (a.isPresent()) {
// transform a, send it to builder
}
else {
// make new attribute, send it to builder
}
}
}
This is straightforward enough, but it feels a little disconnected.
The other way involves a _stateful_ transform, where there is state in
the transform class. This feels a little more honest, but it means you
have to use the XxxTransform::ofStateful factory. Here, you transform
elements as you go, and you record whether you saw the NestMembers
attribute, and if you didn't, you emit a fresh one in the atEnd:
() -> new ClassTransform() {
boolean foundNM = false;
void accept(ClassBuilder b, ClassElement e) {
switch (e) {
case NestMembersAttribute a -> { transform a; foundNM =
true; }
... other transform stuff ...
}
}
void atEnd(ClassBuilder b) {
if (!foundNM)
builder.accept(NestMembersAttribute.of(...));
}
}
> I ended up reading the existing members, creating a new attribute from
> the existing & new members, then replacing the entire attribute with
> the new one. It was easy to do with a composed transform (very nice
> by the way!), but my code feels ugly in this case - is there a better
> way to do an update of an attribute like this?
Does the second example above seem better?
> There are several ::of methods on NestMembersAttribute that help but
> it might be worth adding a ::withSymbols(NestMembersAttribute,
> ClassDesc... additions) api. I can create a PR if there's interest in
> these kinds of additions.
Yes, we are interested in usability feedback like this. A PR is a good
place to start. I am interested in ensuring that we apply some
relatively uniform principles across the API about what factories we
provide, though, to avoid whack-a-mole in the future. So if we are
missing some overloads for NestMembersAttribute::of, we're probably also
missing the same in Interfaces and other places too.
> One of the areas I was slightly concerned about was object allocation.
> I haven't measured it but using the APIs - especially mixing
> ClassEntry & j.l.constant.ClassDesc - feels like it generates a lot of
> temporary objects to get a set of entries into a consistent format.
> I needed to convert from one format to the other to be able to create
> new attributes - ie: read the set of ClassEntries and then convert
> them to ClassDesc so they fit with the ClassDesc I already had in
> hand. The ::with* apis additions might help here as well? Or maybe
> there's a better pattern for what I'm trying to do. Guidance
> appreciated.
Clearly we've opted into a higher degree of cost relative to ASM on
this. (On the other hand, we picked up a lot both from laziness and
from constant pool sharing, so some of this cancels out.) The basic
principle is that there are both "close to the format" and "convenience"
representations for both things, such as ClassEntry and ClassDesc. (We
struggled over whether to admit strings at all, and are currently biased
against using strings for anything other than names, because anyone who
has used ASM a lot has made the "com/foo/Bar" vs "com.foo.Bar" vs
"Lcom/foo/Bar;" mistake at least a hundred bajillion times.) But a
ClassDesc is more expensive to make and manipulate than a string.
So the advice is to get things into the most close-to-the-representation
format as early as possible (e.g., ClassEntry). And ClassEntry has both
asSymbol() (a ClassDesc) and asInternalName() (a String), so you don't
have to go through ClassDesc when transforming. Adam is also adding
ClassDesc::fromInternalName as an alternate factory to ClassDesc.
My sense is that we have not quite finished rationalizing
representations when it comes to transforming. If you find yourself
having to convert a lot of stuff to ClassDesc and back when
transforming, either there's a better way to do it, or we should expose
a better way to do it. Hopefully one that doesn't leave users in the
position of guessing "what kind of string is this."
> Composable ClassTransforms are a very nice addition in the API. After
> writing the code that dropped an attribute, and replaced it with an
> updated attribute, I was able to lift that into a composed transform
> that could be reused in several places.
This was the holy grail I was seeking, and it took several iterations of
the API to get there. I'm glad we got there.
More information about the classfile-api-dev
mailing list