RFR(L): 8218628: Add detailed message to NullPointerException describing what is null.

Tue Mar 26 11:57:45 UTC 2019

Hi Mandy, 

thanks for reading the JEP and giving detailed feedback!

> In the "Basic algorithm to compute the message" section:
> 
> "This dataflow analysis walks the bytecodes forward simulating the execution
> stack."
> 
> Please elaborate it.  Given a stack trace, it starts with the most recently
> executed method and bytecode index, I assume it simulates the method from
> the beginning of the method until it hits the instruction with the matching BCI.
> Please describe it
A data flow analysis is a fixpoint iteration on the code. It is a  
basic concept of compiler construction and used in lots of places in 
the hotspot code. I don't think I should further give an explanation
of this. Maybe I could point to Wikipedia? https://en.wikipedia.org/wiki/Data-flow_analysis

>  and also the cases when it requires to look at its caller frame.
Never.  Only the method that was executed when the exception is 
thrown is needed. I only mention the backtrace because it happens to 
store the top method and the corresponding bytecode position in the 
top method.

I tried to make this more clear by changing the formulation of the first
two paragraphs.

> "Computing either part of these messages can fail due to insufficient
> information."
> What are the cases that it fails to obtain the information? It'd be useful
> to list some examples if not all.
(a ? b : c).d
You can not know whether the ?-expression yielded b or c, thus
you don't know why the access .d failed.  There is a corresponding example 
in [1]. 

> 
> The "Different bytecode variants of a method" section
> 
> This section raises the case when a method representing an execution stack
> frame throwing NPE becomes obsolete, it does not have the information
> to compute the message lazily.  So this falls into the "insufficient
> information" category and no improved NPE message can be generated.
Yes, note this also only addresses the method in which the exception was raised.

> The "computation overhead" section
> 
> I agree with the requirement to have minimal performance overhead to
> the NPE construction cost.
> 
> "NullPointerExceptions are thrown frequently."
> "Fortunately, most Exceptions are discarded without looking at the message."
> 
> This is interesting.  Do you have any statistic on the number of NPE thrown
> and swallowed on certain application/environment that you have gathered?
No, I can try to generate some numbers, though.  But this assumption was made by 
several others that commented on the previous review thread. For Throwable
in general, it is also the assumption why the intermediate backtrace data
structure is implemented instead of generating the stackFrames right away.

> "If the message is printed based on the backtrace, it must be omitted
>  if the top frame was dropped from the backtrace."
> 
> If NPE includes the hidden frames, `Throwable::getStackTrace` and
> `Throwable::toString` and the VM printing of NPE stack trace needs to
> filter the hidden frames.  I think it's appropriate for this JEP to
> cover rather than in a separate JBS issue.
Hmm, I don't completely understand what you want to say, please
excuse, I'm a foreign speaker :). I'll give a comprehensive answer:
To my understanding this is a consequence of the existing JVM/class file
implementation that calls methods that should be hidden to the user.
The implementation drops these frames when the backtrace datastructure
is computed. If this happens, the information needed to compute the 
message is lost, and it can not be printed.
The fact that this happened must be remembered in the VM
so that printing the message can be suppressed. Else a message for
a completely wrong bytecode is printed.
I think the JEP should mention that this is happening.  How this is 
implemented and then pushed to the repo is not subject to a JEP,
is it?
I made a webrev of its own for this problem, so that the code
can be looked at isolated of other, orthogonal issues.
http://cr.openjdk.java.net/~goetz/wr19/8221077-hidden_to_frame/

> The "Message persistance" section
> 
> I read this section like an implementation details
 Yes, I had the feeling that I was talking too much about implementation details, 
but mostly in the "Basic algorithm" section. 
This section discusses characteristics of NPE that differ from other exceptions
and that were proposed in the course of the review of my 
original webrev. They are visible to the Java implementor. 
So I think this is just the basic information needed in a JEP.

> I think this section
> intends to call out that the message of NPE if constructed by user code
> i.e. via public constructor, should not be altered including no-arg
> constructor or pass a null message to 1-arg constructor.  The enhanced
> NPE message is proposed only when NPE is thrown due to null dereference
> when executing bytecode.
Yes.

> I don't understand what you tried to say about "npe.getMessage() ==
> npe.getMessage()".
> Throwable::getMessage does not guarantee to return same String object for
> each invocation.
It does not guarantee so, but it's the case for most exceptions. Usually, 
one calls super(msg) in an Exception constructor.
It is a design decision to implement it that way, my original proposal
was different. I'm fine with this design, but it should be mentioned I think.

> When deserializing the class bytes may be of a different version, so
> StackTraceElements
> are serialized with its string form.  Serializing NPE with a computed message
> seems to be the only viable solution.
I think implementing writeReplace() is a good idea, too. But 
Roger does not like it: http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-February/058513.html

> The "Message content" section
> 
> I think this section can break down into clear scenarios
> 1. debug information is present
> 2. debug information is absent
> 3. the matching version of bytecode is not available
> 
> I agree that the message should be easy for Java developers including the
> beginners to understand the error.
Hmm, only printing local variable names is affected by the debug information.
So I don't think a section of its own is necessary.

If the matching bytecode is not available, nothing can be printed.
So it does not affect the content of the message. Should I call this
section "Message wording"?  I think this is closer to what I want to 
discuss in this paragraph.

> You bring up a good question about the exception message format e.g.
> whether with
> quotation (none vs single quote vs double quote).  It's good to have a guideline
> on that while this is out of scope of this JEP.  IAE was updated to include
> module name, loader name and improve the error message for the developers
> to understand the exception which is a good improvement.
IAE quotes the names, which are user defined strings. It does not quote class names, method names etc.
Here, I was asked to quote these, which is inconsistent with IAE. But there are other 
examples where method names are quoted ... So there is no clear line followed
yet.  In the end, I don't care, but want advice how to do it. (I would not quote ...).

> "Alternatives" section
> 
> "The current proposal is to implement this in the Java runtime in C++ accessing
>  directly the available datastructures in the metaspace."
> 
> "Also, ASM and StackWalker do not support the full functionality needed.
> StackWalker must be called in the NullPointerException constructor."

> This is not true as you wrote in the subsequent sentence.  If we choose
> the implementation to Java, StackWalker will need to be extended to
> read Throwable's backtrace.  
The senctence above describes the status quo. The current implementation 
of StackWalker must be called in the constructor.

> Such specialized form can provide the
> ability to fetch the bytecode in Method* if appropriate.  This is
> the implementation details.  As discussed in the email thread, another
> implementation choice could be a hybrid of native in the VM and Java
> to accomplish this task (for example fetching the bytecode of the
> current version could be done in the VM if we agree supporting
> the redefinition case).
Yes, obviously we can implement a lot of solutions. Instead of changing
StackWalker, we could simply expose [2] java_lang_Throwable::get_method_and_bci()
to Java in NPE.  After all, we only need the method which happens to be stored
as top frame in backtrace.

> For logic that is not strictly needed to be done in the VM, doing it
> in Java and library is a good option to consider (putting aside the
> amount of new code you have to write).   Looks like you can't evaluate
> the alternatives since existing library code requires work in order to
> prototype this in Java.  So I'd say more investigation needs to be done
> to decide on alternative implementation approaches.

> "Testing" section
> 
> "The basic implementation is in use in SAP's internal Java virtual machine since
> 2006."
> 
> This is good information.  This assumes if the same implementation goes into
> JDK.
> So this is subject to the discussion and code review.   I think it's adequate to say
> unit tests will need to be developed.
There is a unit test in the webrev.  Please point out test cases you think
are missing. [1] stems from this test.

>  I think this feature should not impact much of the existing VM code
> and so running existing jtreg tests, jck tests and other existing test suites should
> provide adequate coverage.
> 
> I found [1] is quite useful to understand the scenarios this JEP considers.  I
> think it
> would be useful to include them in the JEP perhaps as examples when
> describing
> the data flow analysis. 
I thought I picked out the most obvious example, pointer chasing, and used it 
in the motivation and the description of the basic algorithm.

>  [1] can be grouped into several categories and the JEP
> can
> just talk about the worthnoting groups and does not need to list all messages
> such as
> dereference a null receive on getfield, null array element vs null array
Yes, I think it makes sense to put more messages into the JEP. But I would 
like to do that once it is agreed on the messages, so I don't need to 
edit the JEP all the time. 
The file referenced is simply generated by the test, so it's easy to update
and sure to reflect the implementation.

> This JEP includes a few links to the C++ functions in the current webrev.  I
> appreciate that and I assume they will be taken out at some point.
Yes, sure, I can remove that.

Best regards,
  Goetz.