A "RecursiveConstants" attribute
Dan Smith
daniel.smith at oracle.com
Fri Feb 16 23:32:17 UTC 2018
> On Jan 31, 2018, at 2:50 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>
> OK, so the goal is clearly to *prohibit* cyclic constants by enforcing a depth constraint on condy-in-condy nesting. I like this goal much better, and I see how the solution addresses it.
>
> Still, I have to wonder whether its worth the complexity of capturing it explicitly with a new (stackmap-like) attribute, which must be specified, and gives us more chances for the classfile to be out of sync with itself, for the goal of preventing something that can't usefully happen anyway -- as we discussed previously, trying to actually resolve a cyclic constant will likely result in SOE, so such classfiles are pretty much useless and therefore not likely to spread very far.
This is a reasonable critique. Another critique is that the cost of tracking down and parsing an extra attribute will undo a lot of the performance benefit of having it, perhaps even causing a net slowdown.
As we've discussed the idea a little more, my inclination (raised in the meeting this week) is to give up on trying to add structural restrictions on constant pools designed to facilitate cycle detection. It seems like we can, in fact, detect cycles without extra guidance, and with minimal overhead in most cases. Some details below.
Big picture, the most attractive path forward seems to be:
- In 11, a CONSTANT_Dynamic with a cycle is an "error", reported via SOE at resolution time
- When we introduce lazy condy resolution, we move the error detection to class load time, with a check that we're comfortable isn't disrupting startup time
- As other potentially-cyclical constants are introduced, we use the same load time checks
- We never require extra information from the compiler, or reject classes that are hard to prove cycle-free
One nice thing about that last point is that we don't have to worry about incompatible support for "hard" classes in different releases.
I'd love to hear about some experiments with this. We're talking about performance impact, and all discussion to this point has been hypothetical. Would be good to get some concrete feedback.
—Dan
-----
Details:
We started with a rule for format checking (JVMS 4.8) that says it's illegal for a CONSTANT_Dynamic (or, in the future, other potentially-cyclical constants) to refer, via static arguments, to itself—directly or indirectly. But we quickly decided this sounded expensive, and we didn't want a heavy validation step to mess with startup time.
Hence, we explored various ways to prohibit otherwise-valid classes that would be hard to prove cycle-free. Something like the RecursiveConstants attributes seems to be the best option in that space.
However, scrutinizing the original assumption, it seems reasonable that we can optimize the check so that it's extremely lightweight when references are "backward" (to previous entries in the constant pool), and only performs more expensive analysis when references are "forward". Most classes should naturally use backward references exclusively, since entries tend to be generated bottom-up. (In theory. If there are counter examples produced by any mainstream bytecode generators, that would be good to know.)
So a dumb (worst case O(n^2)) but fast (very low constant cost) check could look something like:
void checkCP(int i) {
switch (tag(i)) {
case CONSTANT_Dynamic:
...
for (int j : condyStaticArgs(i)) {
if (j >= i && tag(j) == CONSTANT_Dynamic) {
assertUnreachable(i, j); // can't reach i from j
}
}
...
...
}
}
It's already necessary to check that constant pool references are well-typed, so everything up to 'assertUnreachable' here is practically free*. The 'assertUnreachable' call might be somewhat expensive, but it will rarely be called in practice (and even then, we don't expect many structures to be particularly deep).
(*In the particular case of CONSTANT_Dynamic, there's an indirection through BootstrapMethods. If a class heavily shares BootstrapMethods entries, then there's a new cost here, iterating over the same static arguments multiple times. Some optimization could be done to speed that up. For example, if there are no forward references on the first pass, there will be no forward references on later passes, when i is a larger number.)
A good worst-case test would be a constant pool without cycles, built entirely out of forward references, consisting of lots of deep CONSTANT_Dynamic trees. Needless to say, we don't expect to encounter classes like this in the wild. But, if we do, the big benefit here is that, while they'll be a little slower to load, they'll still work.
More information about the valhalla-spec-observers
mailing list