RFR [16/java.xml] 8248348: Regression caused by the update to BCEL 6.0

Stuart Marks stuart.marks at oracle.com
Fri Jun 26 19:46:01 UTC 2020


On 6/25/20 4:53 PM, Joe Wang wrote:
> Please review a fix to a BCEL regression. At issue was the addition of 
> hashCode() method to Instruction.java in BCEL 6.0. This hashCode() method was 
> implemented to return the instruction's opcode that unfortunately is mutable. 
> The problem hasn't showed up until the code path led to calls to remove from a 
> HashSet a target that has been changed since it was added to the HashSet. The 
> proposed patch is to make the hashCode final/immutable.
> 
> This patch implies that a target object is considered the same one even if its 
> field values may have been changed. It therefore may not be appropriate in other 
> situations (or may cause problems). However, since it had always had no 
> hashCode() override before BCEL 6.0, thereby relying on Object's identity hash 
> code, its use case has been limited and time-tested. It can benefit from this 
> patch in that it provides the same function as Object's hash code, and then 
> serves as a reminder (to any who might read the code) how it was used (objects 
> considered to be the same over the course as far as the hashCode is concerned).
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248348
> webrevs: http://cr.openjdk.java.net/~joehw/jdk16/8248348/webrev/

Hi Joe,

This fix seems vaguely wrong, but in the context of what I understand it's 
doing, it seems that BCEL or the surrounding JAXP code is even more wrong. :-)

 From what I understand, this bugfix was prompted by a case where an Instruction 
was added to a HashSet. The Instruction was then mutated while in the HashSet, 
and this resulted in it not being removed from the HashSet in the future, when 
it was necessary to do so, thus leaving a stale entry. How could this have 
possibly worked?!?

I'm speculating a bit, but I could imagine code like this:

     var inst = new Instruction(ORIG_OPCODE);
     var set = new HashSet<Instruction>();
     set.add(inst);
     ...
     inst.setOpcode(NEW_OPCODE); // (1) mutates inst while in HashSet!
     ...
     set.remove(inst); // (2)

In the current version, the hashCode() is simply the opcode, and equals() 
compares opcodes (and possibly other things). This is correct from the 
standpoint of Instruction itself, as equal items have the same hashcode. 
However, the above code fails because at line (2), the remove() method looks in 
the "wrong" hash bucket and can't find the instruction, so it does nothing.

The patch changes the class to compute the hashcode at construction time, and 
makes the hashcode constant, so it "fixes" the above code, since the remove() 
method will look in the same bucket and find the desired instance. But now 
Instruction breaks the hashCode/equals contract, since it's possible for two 
instances to be equal but to have different hashcodes. (Consider an Instruction 
that was created with NEW_OPCODE. It can be equal to inst, but its hashcode will 
differ.)

It seems to me that this patch is compensating for broken code elsewhere in BCEL 
or JAXP by adding more brokenness. It could be that is ok in this narrow usage 
context. That is, as far as we know, this "shaded" copy of BCEL is used *only* 
by JAXP and not used in general, and it may be that the particular ways that 
JAXP uses BCEL aren't broken (and are indeed fixed) by this change. But I'd like 
to see some assurance of that. It could be that there are bugs introduced by 
this change that simply haven't been uncovered by the testing we've done yet.

Another approach is to figure out what is mutating the Instruction while it's in 
a HashSet and fix that instead.

Unfortunately, either of these approaches seems to involve an arbitrary amount 
of work. :-(

s'marks


More information about the core-libs-dev mailing list