Condy bsm should be idempotent

forax at univ-mlv.fr forax at univ-mlv.fr
Thu Aug 17 23:40:09 UTC 2017


> From: "John Rose" <john.r.rose at oracle.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>
> Cc: "Rémi Forax" <forax at univ-mlv.fr>, "amber-spec-experts"
> <amber-spec-experts at openjdk.java.net>
> Sent: Thursday, August 17, 2017 23:41:15
> Subject: Re: Condy bsm should be idempotent

> On Aug 17, 2017, at 11:41 AM, Brian Goetz <brian.goetz at oracle.com> wrote:

>> I agree, and I think this is already implied by the race-arbitrating behavior of
>> CP resolution. If two threads race to resolve the same CP#, the VM will
>> arbitrarily pick a winner, and toss the losing result. Which means that both
>> results must be, in some sense, equivalent. But there's no harm in stating it
>> (just as there's no harm in reminding people that these are supposed to be
>> CONSTANTS.)
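
To make the race concrete, here is a minimal sketch in Java of "pick a winner, toss the losing result". It is a toy model under stated assumptions, not HotSpot code: the names RaceArbitration, Slot and resolve() are made up for illustration, the bootstrap method is stood in for by a Supplier, and a null result is not modelled.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Toy model of one constant-pool slot; Slot and resolve() are hypothetical names.
final class RaceArbitration {
    static final class Slot {
        private final AtomicReference<Object> cell = new AtomicReference<>();
        private final Supplier<Object> bootstrap;

        Slot(Supplier<Object> bootstrap) {
            this.bootstrap = bootstrap;
        }

        Object resolve() {
            Object seen = cell.get();
            if (seen != null) {
                return seen;                     // already resolved
            }
            Object candidate = bootstrap.get();  // racing threads may each run this
            // Only the first compareAndSet succeeds; a loser drops its own
            // candidate and adopts the winner's value.
            return cell.compareAndSet(null, candidate) ? candidate : cell.get();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Slot slot = new Slot(Object::new);
        Object[] seen = new Object[2];
        Thread t0 = new Thread(() -> seen[0] = slot.resolve());
        Thread t1 = new Thread(() -> seen[1] = slot.resolve());
        t0.start();
        t1.start();
        t0.join();
        t1.join();
        // Both threads observe the same object, whichever one won the race.
        System.out.println(seen[0] == seen[1]);
    }
}
```

The losing candidate is simply dropped, which is only harmless if the two candidates were interchangeable to begin with.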

> We are on tricky ground here, wanting to say something about
> equivalent expressions yielding equivalent results. (And yes,
> it's like the similar desire to say that of course a condy result
> is, somehow, constant.) There's no good way to enforce these
> constraints, short of inventing a restricted subset of Java
> that can be proven to have the desired properties, and then
> requiring that condy expressions use that subset.

> What we can do is give advice to users of condy on how
> to use it safely. And then surround those good behaviors
> with a spec that does something reasonably predictable
> and safe even if the users go off the rails (by accident or
> nefarious design).

> There are a lot of ways to win at this, without solving the
> halting problem for full Java or designing a compile-time
> execution mode for Java. (BTW, I'd like to do the latter,
> some day, but for today let's suppose that condy BSMs
> are completely unpredictable in their actions, unless
> their authors take responsibility for them.)

> The current position is for the JVM to uphold a very simple
> contract: Each CP entry is distinct (as a contract between
> the classfile author and the JVM) and has independent
> behavior, which is idempotent. The linkage process
> *behind* the CP is not, and cannot be, idempotent,
> which is why we have to record both normal and
> exceptional linkage results.
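
A minimal sketch in Java of that contract: the linkage step runs once, and whatever it produced, value or exception, is what every later use of the entry sees. This is a toy model, not JVM code. CachedLinkage, Entry and resolve() are hypothetical names, the bootstrap method is stood in for by a Supplier, and a lock replaces the real race arbitration to keep the sketch short.

```java
import java.util.function.Supplier;

// Toy model, not JVM code: Entry and resolve() are hypothetical names.
final class CachedLinkage {
    static final class Entry {
        private final Supplier<Object> bootstrap;
        private boolean attempted;
        private Object value;
        private RuntimeException failure;

        Entry(Supplier<Object> bootstrap) {
            this.bootstrap = bootstrap;
        }

        // Assumption: a lock is used here only for brevity.
        synchronized Object resolve() {
            if (!attempted) {
                try {
                    value = bootstrap.get();   // non-idempotent linkage step, run once
                } catch (RuntimeException e) {
                    failure = e;               // the failure is recorded too
                }
                attempted = true;
            }
            if (failure != null) {
                throw failure;                 // replay the same exception every time
            }
            return value;
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // A badly behaved bootstrap: fails on its first call, would succeed later.
        Entry flaky = new Entry(() -> {
            if (calls[0]++ == 0) {
                throw new IllegalStateException("first attempt fails");
            }
            return "ok";
        });
        for (int i = 0; i < 2; i++) {
            try {
                flaky.resolve();
            } catch (IllegalStateException e) {
                System.out.println("attempt " + i + ": " + e.getMessage());
            }
        }
        System.out.println("bootstrap ran " + calls[0] + " time(s)");
    }
}
```

Even though the bootstrap would have succeeded on a second call, the entry keeps reporting the first outcome, so the entry itself behaves idempotently while the code behind it does not.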

> Despite the inconvenience for either Remi or ASM users
> (and likewise with Maurizio) I think this is the best way
> to go because it's the simplest for the most delicate part
> of the system, the JVMS. (That's where the attackers
> attack, and where needless complexity is to be avoided.)
> So, I'd prefer to leave the JVMS as it is, and allow bytecode
> generation APIs to cater *only* (or mainly) to well-behaved
> authors who would never dream of writing non-idempotent
> condys.

> To complicate the JVMS in order to regularize the user
> model of ASM would be a mistake. But I don't advocate
> complicating ASM either. Instead, I think it is perfectly
> reasonable to do any of three things in ASM (and other
> tools like it):

> A. Continue normalizing all CP entries, including the
> new ones. This means that a null translation might
> de-duplicate equivalent condy entries. This will
> only hurt people who are creating bad class files
> on purpose, either as negative tests or to explore
> the dark corners of the JVMS behavior. (Remember,
> the bright center requires human responsibility.)

> B. For the new data-type used by ASM to describe
> a condy constant, add a 32-bit "stamp" field which
> participates in that type's equals/hashCode/toString
> methods. This "stamp" field is an arbitrary value
> serving only to differentiate otherwise equivalent
> condy constants. User-built constants default
> their stamp to zero. Constants built during class
> file reading default their stamp to the CP index
> at which they occur. New condy constants are
> interned, old ones are retained distinct. And
> nobody needs to be the wiser, unless they choose
> to look very, very close at the behavior of ASM.

> (B2 Variation: Give the stamp value of zero to
> every unique condy constant encountered
> in a class file. For the edge case of non-unique
> constants, give them stamps of their CP indexes.
> Other variations are possible. I don't think the
> effort would be well spent, because it requires
> extra stamp-suppressing comparison logic, which
> goes against ASM's minimalist design, and
> may slightly slow ASM's processing of condy.
> Perhaps an optional method could be given
> to find a pre-existing condy item that matches
> a given one? Nobody will use it, I think.)

> C. Say that ASM is free to do either of behaviors
> A (interning) or B (keeping distinct), as a matter
> of implementation. If you need to predict the
> treatment of equivalent condy constants, you
> need to find a workaround: Either don't use
> ASM, or add some salt to the name component
> of the condy's name-and-type, and remove it as a
> post-pass.
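
Options A and B can be sketched in a few lines of Java. The types below are hypothetical and are not ASM's actual API: CondyKey stands in for whatever data type ASM uses to describe a condy constant, its bootstrap field is a plain string standing in for the BSM handle and static arguments, and Pool is a made-up constant pool writer. The stamp takes part in equals/hashCode, so always using stamp 0 gives the interning of option A, while using the CP index as the stamp keeps read-in constants distinct as in option B.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical types for illustration only; this is not ASM's actual API.
final class StampedCondy {
    static final class CondyKey {
        final String name;
        final String descriptor;
        final String bootstrap;  // stands in for the BSM handle and static arguments
        final int stamp;         // 0 if user-built, CP index if read from a class file

        CondyKey(String name, String descriptor, String bootstrap, int stamp) {
            this.name = name;
            this.descriptor = descriptor;
            this.bootstrap = bootstrap;
            this.stamp = stamp;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof CondyKey)) {
                return false;
            }
            CondyKey k = (CondyKey) o;
            return stamp == k.stamp
                && name.equals(k.name)
                && descriptor.equals(k.descriptor)
                && bootstrap.equals(k.bootstrap);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, descriptor, bootstrap, stamp);
        }

        @Override
        public String toString() {
            return name + ":" + descriptor + " via " + bootstrap + " #" + stamp;
        }
    }

    // Made-up pool writer: equal keys share one index, unequal keys get new ones.
    static final class Pool {
        private final Map<CondyKey, Integer> indexes = new HashMap<>();

        int indexOf(CondyKey key) {
            Integer index = indexes.get(key);
            if (index == null) {
                index = indexes.size() + 1;
                indexes.put(key, index);
            }
            return index;
        }
    }

    public static void main(String[] args) {
        Pool pool = new Pool();
        CondyKey a = new CondyKey("nonce", "Ljava/lang/Object;", "Bsm.nonce", 0);
        CondyKey b = new CondyKey("nonce", "Ljava/lang/Object;", "Bsm.nonce", 0);
        CondyKey read7 = new CondyKey("nonce", "Ljava/lang/Object;", "Bsm.nonce", 7);
        CondyKey read9 = new CondyKey("nonce", "Ljava/lang/Object;", "Bsm.nonce", 9);
        System.out.println(pool.indexOf(a) == pool.indexOf(b));         // interned
        System.out.println(pool.indexOf(read7) != pool.indexOf(read9)); // kept distinct
    }
}
```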

> The choice between A/B/C can be adjusted over
> time in response to bugs. Perhaps C is the best
> choice to start with, as a contract, with A as an
> implementation, switching to B or B2 if users
> run into actual problems with duplicate condys.
> (They probably won't.)

> The JVM must retain the distinction between equivalent
> condy constants at distinct CP indexes. It cannot
> do the interning (in A above) because that's too
> expensive; that's an off-line tool's job. It might specify
> the equivalent of C (threaten to intern), but I think
> that is an empty threat, and could only cause harm
> down the road.

> I'll go even farther: For the JVM, we should specifically
> test that distinct condy constants with equivalent
> structure *can* evaluate to distinct results. The purpose
> of this is not to encourage the use case (although it
> could be used for things like cryptographic nonces)
> but rather as a sort of edge behavior test, to ensure
> that there is no "cross-talk" between constant pool entries.
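
The shape of such an edge-behavior test can be sketched in Java as follows. This is a toy model, not a real JVM test: a real test would need a generated class file with two structurally equivalent condy entries. Here the constant pool is modelled as a per-index cache, and NoCrossTalk, ldc() and the CP indexes 3 and 5 are made up for illustration. The nonce method has the shape of a condy bootstrap method (Lookup, name, type) but is called directly.

```java
import java.lang.invoke.MethodHandles;

// Toy model of the edge-behavior test; a real test would generate a class file.
final class NoCrossTalk {
    // Shaped like a condy bootstrap method, but deliberately non-idempotent:
    // every call returns a fresh object.
    static Object nonce(MethodHandles.Lookup lookup, String name, Class<?> type) {
        return new Object();
    }

    // resolved[i] caches the value for hypothetical CP index i.
    private final Object[] resolved = new Object[8];

    // Models loading the constant at cpIndex: resolve on first use, then replay.
    Object ldc(int cpIndex) {
        if (resolved[cpIndex] == null) {
            resolved[cpIndex] = nonce(MethodHandles.lookup(), "nonce", Object.class);
        }
        return resolved[cpIndex];
    }

    public static void main(String[] args) {
        NoCrossTalk pool = new NoCrossTalk();
        // Same index: the value is stable across uses.
        System.out.println(pool.ldc(3) == pool.ldc(3));
        // Equivalent entries at different indexes: values are independent.
        System.out.println(pool.ldc(3) != pool.ldc(5));
    }
}
```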

> — John

I first implemented something like B2, after a private discussion with John about how to implement stamps:
I used the constant pool index as the stamp when reading, 0 if you want a shared constant, and 65536 if you do not want a shared one.

I then decided to take a simpler route (A) after remembering that ASM already has that bug (feature?) of de-duplicating constant pool entries against already existing ones, and very few people complain about that.
For the JDK tests that require several structurally equivalent condys, as John said, one can use ASM to generate two slightly different condys (just change the name) and then ask Paul: he told me at the JVM Summit that he secretly wants to become a hex editor expert :)

So I agree that the VM should not try to do any interning and should resolve each condy once. I still think the spec should, at least in a discussion section, say that the returned value should be constant and the BSM should be idempotent.

regards, 
Rémi 