Inconsistencies in type-checking of exception handlers
Judicaël Courant
judicael.courant at trust-in-soft.com
Fri Sep 16 07:05:35 UTC 2022
Dear Authors of the JVM Specification,
We recently discovered what we believe to be inconsistencies in the
section 4.10.1.6 ("Type Checking Methods with Code") of the
specification:
- These inconsistencies are related to the type-checking of exception
handlers in <init> methods.
- This part of the specification has not been changed from version
8 to version 18.
- The inconsistencies seem quite serious: the formal Prolog
specification does not seem to match the informal one, and neither
of them makes sense.
Could you please:
- Confirm that our understanding of this point is correct? (or tell
us why it is not)
- Tell us what the rules for type-checking exception handlers in
<init> methods were supposed to be?
We attach an appendix to this message describing more precisely the
problems we identified in this part of the specification.
Yours,
Martin Féaux de Lacroix & Judicaël Courant.
Appendix
========
There seems to be inconsistencies related to the type-checking of
exception handlers in <init> methods (in section "4.10.1.6. Type
Checking Methods with Code" of the JVM specification), more precisely
to the paragraph stating:
> An additional requirement exists for a handler inside an <init>
> method if one of the instructions covered by the handler is
> invokespecial of an <init> method. In this case, the fact that a
> handler is running means the object under construction is likely
> broken, so it is important that the handler does not swallow the
> exception and allow the enclosing <init> method to return normally
> to the caller. Accordingly, the handler is required to either
> complete abruptly by throwing an exception to the caller of the
> enclosing <init> method, or to loop forever.
and to its formal counterpart, the definition of predicate
`initHandlerIsLegal(Environment, Handler)`, given in the second Prolog
listing following that paragraph (the listing starts with
`initHandlerIsLegal(Environment, Handler) :-` and contains eight
clauses).
We found six kinds of problems with this definition:
- Syntax Error in the Code
- Missing Code
- Suspicious Logic
- Code Does Not Match Informal Explanation
- Informal Specification Does Not Match Stated Intent
- OpenJDK Does Not Follow the Specification
(The latest one might be a problem in the implementation of the JVM,
but it is suspicious nevertheless.)
These problems are detailed in the following subsections.
Syntax Error in the Code
------------------------
There seems to be a syntax error in the code. Indeed the predicate
`isInitHandler(Environment, Handler)` is defined as follows:
```
isInitHandler(Environment, Handler) :-
Environment = environment(_Class, Method, _, Instructions, _, _),
isInit(Method).
member(instruction(_, invokespecial(CP)), Instructions),
CP = method(MethodClassName, '<init>', Descriptor).
```
and the full dot after `isInit(Method)` looks incorrect.
Missing Code
------------
Some code seems to be missing from the clause defining the
`initHandlerIsLegal(Environment, Handler)` predicate. Indeed,
nothing in this clause relates `Target` to the `Handler`:
```
initHandlerIsLegal(Environment, Handler) :-
isInitHandler(Environment, Handler),
sublist(isApplicableInstruction(Target), Instructions,
HandlerInstructions),
noAttemptToReturnNormally(HandlerInstructions).
```
So it seems a subgoal `Handler = handler(_, _, Target, _)` is missing.
Suspicious Logic
----------------
At some places, the logic of the code looks suspicious.
For instance, we would expect the
`noAttemptToReturnNormally(Instructions)` to be true if there is no
attempt to return normally in the list `Instructions` but that is not
the case. Indeed, the definition reads:
```
noAttemptToReturnNormally(Instructions) :-
notMember(instruction(_, return), Instructions).
noAttemptToReturnNormally(Instructions) :-
member(instruction(_, athrow), Instructions).
```
which means that the predicate is true as soon as there is no `return`
instruction in the list `Instructions` OR there is at least one
`athrow` instruction. In other words,
`noAttemptToReturnNormally` returns true when applied to
`[instruction(1, return), instruction(2, athrow)]`, which is somewhat
surprising. A conjunction would seem more appropriate than a
disjunction.
Also, the "additional requirement" applies only
> for a handler inside an <init> method if one of the instructions
> covered by the handler is invokespecial of an <init> method.
And the code introduces a predicate `notInitHandler(Environment,
Handler)` to formalize the negation of this condition. It should
therefore be true when the method we deal with is not an `<init>` one,
or *none* of the instructions covered by the handler is invokespecial
of an `<init>` method.
Unfortunately, the second clause for `notInitHandler` makes it true as
soon as at least *one* of the instructions *not necessarily covered by
the handler* is invokespecial:
```
notInitHandler(Environment, Handler) :-
Environment = environment(_Class, Method, _, Instructions, _, _),
isInit(Method),
member(instruction(_, invokespecial(CP)), Instructions),
CP = method(MethodClassName, MethodName, Descriptor),
MethodName \= '<init>'.
```
Thus you may very well have `notInitHandler(Environment, Handler)`
true and also `isInitHandler(Environment, Handler)` true for the very
same `Environment` and `Handler` (they are both true as soon as
`Instructions` contain at least one `invokespecial` calling an
`<init>` method and at least another one calling a non-`<init>`
method. It is very surprising. (If it is deliberate, an informal
explanation would very much help.)
Code Does Not Match Informal Explanation
----------------------------------------
The code does not seem match the informal explanation describing
it. Indeed, the informal explanation reads
> Accordingly, the handler is required to either complete abruptly by
> throwing an exception to the caller of the enclosing <init> method,
> or to loop forever.
The relevant clause is:
```
initHandlerIsLegal(Environment, Handler) :-
isInitHandler(Environment, Handler),
sublist(isApplicableInstruction(Target), Instructions,
HandlerInstructions),
noAttemptToReturnNormally(HandlerInstructions).
```
Regardless of the missing subgoal linking `Handler` and `Target`, and
regardless of the problem with `noAttemptToReturnNormally`, the only
instructions that are checked for an attempt to return normally are
the ones satisfying the `isApplicableInstruction` predicate, whose
definition reads:
```
isApplicableInstruction(HandlerStart, instruction(Offset, _)) :-
Offset >= HandlerStart.
```
so, the elliptic formulation "the handler" in the informal description
actually refers to the list of instructions starting at the target of
the exception handler in its formal counterpart. There seems to be an
implicit assumption here: a handler would be a list of consecutive
instructions. But what if the handler is somewhat convoluted and
contains a backward jump to an instruction whose offset is *less* than
the target of the exception handler?
In addition, although the informal description of the additional
requirement applies only "if one of the instructions covered by the
handler is invokespecial of an <init> method", in clauses for
`isInitHandler` and `notInitHandler` the whole list `Instructions` is
searched for an `invokespecial` of an `<init>` method, not just the
list of applicable instructions.
Informal Specification Does Not Match Stated Intent
---------------------------------------------------
The informal explanation is unclear (to us at least), and does not
seem to precisely match the diagnostic "the object under construction
is likely broken". If the intent is to state the JVM counterpart of a
Java rule saying that exception thrown while calling `super()` should
not be discarded, the informal explanation should probably state a
condition saying the `invokespecial` of `<init>` method *is applied to
this, if it has not been initialized yet*. But, in order to check
this, do you need more than checking that the `flagThisUninit` is not
set when type-checking the `return` instruction?
OpenJDK Does Not Follow the Specification
-----------------------------------------
Neither the formal code nor the informal explanation seem to match
what the JVM does when checking for this condition: it is possible to
have a code which is incorrect with respect to both the informal
explanation and the formal definition and yet is accepted by the JVM
version 8. For instance, the constructor of the following Java class:
```
public class Foo {
Integer n;
Foo() {
try {
n = new Integer(2 / 0); // invokespecial <init> here
} catch (Exception e) {
//return normally
}
}
}
```
is compiled by `javac` into the following bytecode:
```
{
java.lang.Integer n;
descriptor: Ljava/lang/Integer;
flags:
Foo();
descriptor: ()V
flags:
Code:
stack=5, locals=2, args_size=1
0: aload_0
1: invokespecial #1 // Method
java/lang/Object."<init>":()V
4: aload_0
5: new #2 // class java/lang/Integer
8: dup
9: iconst_2
10: iconst_0
11: idiv
12: invokespecial #3 // Method
java/lang/Integer."<init>":(I)V
15: putfield #4 // Field
n:Ljava/lang/Integer;
18: goto 22
21: astore_1
22: return
Exception table:
from to target type
4 18 21 Class java/lang/Exception
LineNumberTable:
line 3: 0
line 5: 4
line 8: 18
line 6: 21
line 9: 22
LocalVariableTable:
Start Length Slot Name Signature
0 23 0 this LFoo;
StackMapTable: number_of_entries = 2
frame_type = 255 /* full_frame */
offset_delta = 21
locals = [ class Foo ]
stack = [ class java/lang/Exception ]
frame_type = 0 /* same */
}
```
So the `invokespecial` of `<init>`, at address 12, is covered by the
exception handler (ranging from instruction 4 included to 18
excluded). Thus the "additional requirement" should forbid the handler
(starting at address 21) to return normally. Both the informal and
formal specification agree here that the instructions at stake here are
the last two (`a_store_1` and `return`), and that the code should be
rejected. Yet `java` accepts it.
Note: A bit of context
----------------------
At TrustInSoft, we are currently writing a source-to-source compiler
from Java 8 to a C-like language. More exactly, we compile bytecode
classes into a C-like language; we have OpenJDK translate Java source
files into bytecode. Part of this work requires us to type-check the
bytecode.
We try to follow the JVM specification as closely as possible, and we
recently stumbled over two issues which seemed to be errors in the JVM
specification version 8:
- The first issue was actually solved in the JVM specification version
9 (the typing rule given for instruction `putfield` in section
"4.10.1.9. Type Checking Instructions": the case of an uninitialized
object was not covered in version 8 but was added in version 9. We
were quite intrigued that version 8 was not fixed, but we took for
granted that we are supposed to follow version 9.
- The second issue is the one described above.
--
Judicaël Courant
Software Engineer, TrustInSoft
(+33) (0)6 52 42 88 23
https://trust-in-soft.com
More information about the jls-jvms-spec-comments
mailing list