RFR(L): 8218628: Add detailed message to NullPointerException describing what is null.

Wed Feb 20 14:17:39 UTC 2019

with my ASM hat,

----- Mail original -----
> De: "Lindenmaier, Goetz" <goetz.lindenmaier at sap.com>
> À: "Peter Levart" <peter.levart at gmail.com>, "Roger Riggs" <Roger.Riggs at oracle.com>, "core-libs-dev"
> <core-libs-dev at openjdk.java.net>
> Cc: hotspot-runtime-dev at openjdk.java.net
> Envoyé: Mardi 19 Février 2019 17:08:07
> Objet: RE: RFR(L): 8218628: Add detailed message to NullPointerException describing what is null.

> Hi,
> 
> I have been looking at how to implement this in Java using
> StackWalker and ASM.
> Either I oversee something or there is a row of deficiencies with
> these tools to solve my issue.
> 
> StackWalker allows me to collect class name, method name, method
> description and BCI. Given the current implementation, this must
> happen in the NPE constructor which is very inefficient, but
> works. Mandy, should I really change Throwable/StackWalker to
> overcome this? Alternatively, I could JNI-call into the VM
> and get the info when I need it.
> 
> ASM allows me to access the class and the bytecode of the method.
> 
> Unfortunately, ASM has no notion of BCI. It also does not
> externally expose the bytecodes of the methods. It simplifies
> the bytecodes, which is useful to manipulate them, but does
> not allow me to recompute the BCI of instructions, say, iterating
> the InsnList.

yes, it's a feature of ASM, offer the right abstraction to do bytecode generation/transformation i.e. a stream of opcodes that are more abstract than the real bytecode, so there is no way to get a direct access to the bytecode at a specific BCI.

> 
> Further, as I understand, the bytecode I can access in ASM
> is the bytecode as it resides in the class file, not the
> bytecode used by hotspot (bitwise). Hotspot rewrites bytecodes when
> it loads them. And the BCI returned by StackWalker corresponds
> to the bytecode used by hotspot, not that in the class file.
> 
> If a method has been rewritten, there even might be several
> instances of the same method/bytecode, the old still needed for
> running executions and the new, rewritten one. The BCI
> is only correct for one of these.
> 
> So I don't see how I should find the ASM representation of
> the bytecode that caused the NPE.  Do you have any advice?
> 
> Best regards,
>  Goetz.
> 

Rémi

> 
>> -----Original Message-----
>> From: Peter Levart <peter.levart at gmail.com>
>> Sent: Freitag, 15. Februar 2019 18:38
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Roger Riggs
>> <Roger.Riggs at oracle.com>; Java Core Libs <core-libs-dev at openjdk.java.net>
>> Cc: hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L): 8218628: Add detailed message to NullPointerException
>> describing what is null.
>> 
>> Hi Goetz,
>> 
>> Just a couple of things if you go that route (of placeholder String):
>> 
>> 
>>    56     private static final String MESSAGE_PLACEHOLDER =
>> "NPE_MESSAGE_PLACEHOLDER";
>> 
>> Here, the initializer of that constant should create a private instance
>> of String:
>> 
>>      private static final String MESSAGE_PLACEHOLDER = new
>> String("NPE_MESSAGE_PLACEHOLDER");
>> 
>> or even (since the content of the placeholder string doesn't matter -
>> the identity does):
>> 
>>      private static final String MESSAGE_PLACEHOLDER = new String();
>> 
>> ...or else anybody calling new
>> NullPointerException("NPE_MESSAGE_PLACEHOLDER") will also be the subject
>> of replacement since the string literals are interned constants. I.e.:
>> 
>> "NPE_MESSAGE_PLACEHOLDER" == "NPE_MESSAGE_PLACEHOLDER"
>> 
>> 
>>    81     public String getMessage() {
>>    82         String message = super.getMessage();
>>    83         if (message == MESSAGE_PLACEHOLDER) {
>>    84             message = getExtendedNPEMessage();
>>    85             setMessage(message);
>>    86         }
>>    87         return super.getMessage();
>>    88     }
>> 
>> Why not simply returning 'message' local variable here?
>> 
>> 
>> Regards, Peter
>> 
>> On 2/14/19 5:35 PM, Lindenmaier, Goetz wrote:
>> > Hi Roger,
>> >
>> >> Maybe 10 years ago, when native was the only solution.
>> >> Now there are tools to analyze bytecode in java.
>> > I'm working on a Java implementation.
>> >
>> >>> Peter Levart proposed to initialize the message with a sentinel instead.
>> >>> I'll implement this as a proposal.
>> >> That's an extra overhead too, any assignment is.
>> > Yes, any code requires some run time. But I think this really is negligible
>> > in comparison to generating the backtrace, which happens on every
>> > exception.  But I'm fine with the 'always recompute, no serialization'
>> > solution. If it turns out to be limited, it still can be changed easily.
>> >
>> >>> I guess we can go without.
>> >>> It would be possible to construct a case where two threads
>> >>> do getMessage() on the same NPE Object but the string is not
>> >>> the same.
>> >> Really!, if the bci is the same, the bytecode is the same, what could be
>> different
>> >> that would change the data used to create the exception?
>> > e.getMessage().equals(e.getMessage()) will hold, but
>> > e.getMessage() != e.getMessage().
>> >
>> > I just implemented the two variants of computing the message, they
>> > basically differ in NullPointerException.java:
>> > http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/03-
>> PeterLevart/
>> > http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/04-
>> RogerRiggs/
>> >
>> > I also cleaned up the parameters passed as proposed.
>> >
>> >>> We uses this code since 2006, it needed only few changes.  I would like to
>> >>> contribute a follow up for hidden frames of lambdas.
>> >> Valhalla and evolution of the byte code needs to be considered.
>> >> Just because its worked for years does not mean it's the best approach
>> >> today.  Dropping it in now means maintaining it in its current form
>> >> or needing to re-write it later.
>> > Well, yes, that is true. For any change.  For any project.  We have heard
>> > this argument for many of our changes. I don't really think it's a good
>> > argument. As I understand Valhalla is not targeted for jdk13, and
>> > holding up changes for some future projects not really is the process
>> > of OpenJDK, is it?
>> >
>> >>>> The change to the jtreg/vmTestbase/jit/t/t104/t104.gold file raises a
>> >>>> question
>> >>>> about how useful the information is,  developers should not need to know
>> >>>> about "slot 1".
>> >>> Developers should work with class files with debug information and then
>> >>> they would get the name printed. If exceptions are thrown in production
>> >>> code, any information is helpful.  Should I just write "a local"?
>> >> Go back to who you think is going to benefit from it, it seems targeted
>> >> at a fairly novice developer, so exposing low level details may be more
>> >> confusing that just giving a line number and NPE.
>> > Especially on our improved NPE messages we always got good feedback
>> > from the developers within SAP. And there are quite some experienced
>> > people.  Any message requires some background to be helpful.  And I
>> > like to get any information I can have in first place. Attaching the
>> > debugger often is a big overhead. E.g., I look at about 50 failing jtreg
>> > tests every day, I don't want to get each into the debugger every time
>> > in the setup where it was running to see what is wrong ...
>> >
>> > E.g., com/sun/java/swing/plaf/windows/AltFocusIssueTest.java
>> > is failing in a headless environment with an NPE on this line:
>> >      SwingUtilities.invokeAndWait(() -> frame.dispose());
>> > With this change it said "while trying to invoke the method
>> javax.swing.JFrame.dispose(()V) of a null object loaded from static field
>> AltFocusIssueTest.frame" and I figured it's the fact we run headless that
>> > makes the test fail without even looking at the code.
>> >
>> > Best regards,
>> >    Goetz
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > From: Roger Riggs <Roger.Riggs at oracle.com>
>> > Sent: Dienstag, 12. Februar 2019 17:03
>> > To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Java Core Libs <core-
>> libs-dev at openjdk.java.net>
>> > Cc: hotspot-runtime-dev at openjdk.java.net
>> > Subject: Re: RFR(L): 8218628: Add detailed message to NullPointerException
>> describing what is null.
>> >
>> > Hi,
>> > On 02/12/2019 08:14 AM, Lindenmaier, Goetz wrote:
>> > Hi Roger,
>> >
>> > thanks for looking at my change!
>> >
>> > A few higher level issues should be considered, though the details of
>> > the webrev captured my immediate attention.
>> >
>> > Is this the right feature
>> > Yes!!
>> > Maybe, that's what debuggers are for.
>> >
>> >
>> >
>> >   and is this the right level of implementation (C++/native)?
>> > See below.
>> > Maybe 10 years ago, when native was the only solution.
>> > Now there are tools to analyze bytecode in java.
>> >
>> >
>> >
>> >   From a security perspective, adding field names to exceptions exposes
>> > information to callers that they could not otherwise get and that breaks
>> > encapsulation.
>> > That needs to be evaluated.
>> > I can not comment on that. How can I trigger an evaluation of this?
>> > Who needs to evaluate this?
>> >
>> > I think there are ways to make this easier to implement and be easier to
>> > maintain and perform better.
>> >
>> > NullPointerException:
>> > 28: unused import of Field
>> > Removed
>> >
>> > 64: The lazyComputeMessage boolean should be inverted so it does not
>> > require
>> >      initialization, false is the default, anything else adds overhead.
>> >      And it may not be needed.
>> > Peter Levart proposed to initialize the message with a sentinel instead.
>> > I'll implement this as a proposal.
>> > That's an extra overhead too, any assignment is.
>> >
>> >
>> >
>> > 91: Why do you think synchonization is necessary?  It is a performance hit.
>> >      It is highly unlikely to be called from multiple threads and the
>> > value will
>> >      be the same whether it is computed once or multiple times by
>> > different threads.
>> > I guess we can go without.
>> > It would be possible to construct a case where two threads
>> > do getMessage() on the same NPE Object but the string is not
>> > the same.
>> > Really!, if the bci is the same, the bytecode is the same, what could be
>> different
>> > that would change the data used to create the exception?
>> >
>> >
>> >
>> > 99: extendedMessage will never be null (so says the javadoc of
>> > getExtendedNPEMessage)
>> >     Bug: If it does return null, then null is returned from getMessage
>> >     regardless of the value of super.getMessage().
>> > Fixed.
>> >
>> > 121: Simplify getExtendedNPEMessage by making it only responsible for
>> > the detail
>> >     and not concatenation with an existing message.  Put that in
>> > getMessage().
>> > Fixed. You are right, I only call this when the message is NULL.
>> >
>> >     Its not strictly necessary to change the updated message with
>> > setMessage().
>> >     Avoiding that will avoid complexity and a performance hit.
>> >     The message will be computed each time it is needed, and that happens
>> > infrequently.
>> >     (Even in the serialization case, it will be re-computed from the
>> > attached stack frames).
>> > No, you can not recompute it from the stacktrace because that
>> > does not contain the BCI. Also, after deserialization, the bytecode
>> > might look different or not available at all. It depends on the actual
>> > bytecode that has been running when the exception was thrown.
>> > Right, I realized this too when thinking about it over the weekend.
>> >
>> > If a suitable low impact mechanism can't be found, then just
>> > go back to not exposing the extended message and use only the original.
>> > Its a bad idea to twist and contort the local design and accessibility
>> > just for serialization. What remote or delayed client needs to know this?
>> >
>> >
>> >
>> >
>> >     And it avoids adding setMessage() to Throwable, that's significant
>> > change since
>> >     the current implementation only allows the message to be initialized
>> > not updated.
>> >     Immutable is an important characteristic to maintain.
>> > Yes, I don't like I have to set this. But it follows the same pattern
>> > as with the stack trace which is only computed on demand. Thus,
>> > Throwable is not immutable anyways.  Before, I implemented this by a
>> > JNI call hiding this in the Java code.
>> > I proposed implementing setting the field by reflection, Christoph
>> > proposed a shared secret.  David favored the package private setter.
>> > Please advice what is best.
>> > All of them are bad. Private needs to mean private.
>> >
>> > And making it mutable, also means that synchronization or volatile
>> > is needed to ensure a consistent value for getMessage().
>> > That turns into a performance issue for all throwables.
>> > None of the above.
>> >
>> >
>> >
>> >   That makes the writeReplace() unnecessary too.
>> > No. I need this, see above.
>> > See above, but is not essential to the core feature.
>> >
>> >
>> >
>> > Additional command line arguments are an unnecessary complexity,
>> > documentation, and maintenance overhead.  If the feature is as valuable as
>> > you suggest, it should be enabled all the time.
>> > Removed.
>> >
>> > I'm assuming there are no tests broken by the modification of the message.
>> > It is likely that somewhere an application or test looks at the message
>> > itself.
>> > Has that been verified?
>> > Our nightly testing runs the headless jdk and hostspot jtreg tests, jck tests
>> and some
>> > other applications. They are all unaffected by this change after adapting the
>> > message in jtreg/vmTestbase/jit/t/t104/t104.gold.
>> > (win-x86_64, linux: ppc64, ppc64le, s390, x86_64, aarch, aix, solaris-sparc,
>> mac)
>> > Also, I modified quite some messages before which didn't cause any follow-
>> up
>> > breakages to my knowledge.
>> > It does seem unlikely.
>> >
>> >
>> >
>> > I can't speak for the hotspot code, but that seems to add a lot of code
>> > to support
>> > only this convenience feature.  That's a maintenance overhead and burden.
>> > We uses this code since 2006, it needed only few changes.  I would like to
>> > contribute a follow up for hidden frames of lambdas.
>> > Valhalla and evolution of the byte code needs to be considered.
>> > Just because its worked for years does not mean its the best approach
>> > today.  Dropping it in now means maintaining it in its current form
>> > or needing to re-write it later.
>> >
>> >
>> >
>> > The change to the jtreg/vmTestbase/jit/t/t104/t104.gold file raises a
>> > question
>> > about how useful the information is,  developers should not need to know
>> > about "slot 1".
>> > Developers should work with class files with debug information and then
>> > they would get the name printed. If exceptions are thrown in production
>> > code, any information is helpful.  Should I just write "a local"?
>> > Go back to who you think is going to benefit from it, it seems targeted
>> > at a fairly novice developer, so exposing low level details may be more
>> > confusing that just giving a line number and NPE.
>> >
>> >
>> >
>> > The test output of NullPointerExceptionTest shows
>> > "java.lang.Object.hashCode(()I)".
>> > Why is the argument type for Integer shown as "()I";  that's not the
>> > conventional
>> > representation of a primitive int.
>> > I already discussed this with David Holmes:
>> > http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-
>> February/058464.html
>> > Other exceptions use the same format. I don't know of an utility in
>> > hotspot to translate the format to Java source code syntax.  We
>> > implemented such an utility in our internal VM, though, and I would
>> > like to contribute this, too, adapting all the exceptions. I would do this
>> > in a follow-up.
>> > There may be better tools for formatting if it was done in java at a
>> > more appropriate level of abstraction.
>> >
>> >
>> >
>> > Generally, the explanations are too verbose.
>> > Method names and field names would be easier to recognize if they were
>> > quoted, as is done with 'this'.
>> > Fixed, although this is not common in exception messages. See the
>> > messages in
>> http://hg.openjdk.java.net/jdk/jdk/file/2178d300d6b3/test/hotspot/jtreg/runt
>> ime/exceptionMsgs/IllegalAccessError/IllegalAccessErrorTest.java,
>> > line 57ff. Only the String of the name field is quoted, not any entities
>> > declared in the code.
>> > Similar
>> http://hg.openjdk.java.net/jdk/jdk/file/2178d300d6b3/test/hotspot/jtreg/runt
>> ime/LoaderConstraints/vtableAME/Test.java
>> > line 60, also showing the internal signature, see above.
>> >
>> > Argument numbers should be digits,
>> > not English words (first, second, etc) to make them easier to pick out.
>> > Fixed.
>> > And has it limits that do not read easily, e.g. "nr. 9".
>> > Sorry, I don't understand
>> > src/hotspot/share/classfile/bytecodeUtils.cpp:566
>> >
>> test/hotspot/jtreg/runtime/exceptionMsgs/NullPointerException/NullPointerE
>> xceptionTest.java:612
>> >
>> > I would not describe 'this' as a local variable.
>> > Fixed.
>> >
>> > Tests, especially new tests should compile without warnings.
>> > NullPointerExceptionTest.java:125 newInstance has been deprecated.
>> > Fixed.
>> >
>> > The messages can be easier to understand, e.g.
>> > 'field a1 is null, trying to invoke a1.hashCode...' instead of:
>> > "while trying to invoke the method java.lang.Object.hashCode(()I) on a
>> > null reference loaded from local variable 'a1'"
>> > The message is built along the structure of the bytecode.
>> > I'll try to change this and then will come up with a new webrev .
>> >
>> > It will easier to understand if it looks more like the code.
>> > BTW, what does this look like for fully optimized code?
>> > You mean whether the bytecode is jitted? It's independent
>> > of how the code is executed, interpreted or compiled.
>> >
>> > Does it bear any resemblance to the source code at all?
>> > Or does it have to be deoptimized to come up with a sensible source
>> > reference.
>> > No, jitted methods must not be deoptimized.
>> >
>> > How much of this can be done in Java code with StackWalker and other
>> > java APIs?
>> > It would be a shame to add this much native code if there was a more robust
>> > way to implement it using APIs with more leverage.
>> > StackWalker operates on the Java stack tracke, which does not contain the
>> BCI
>> > to my knowledge.  Also, I need to read the bytecodes. Are there APIs to
>> access
>> > the bytecode as it resides in the metaspace, rewritten, constants resolved
>> etc?
>> > The code relies on this.
>> >
>> >  From the other thread, beware adding overhead to the constructor.
>> > The lazy computation is essential.
>> > There are a *lot* of NPEs whose messages will never be needed.
>> >
>> > Regards, Roger
>> >
>> >
>> >
>> >
>> > Best regards,
>> >    Goetz.
>> >
>> >
>> >
>> > Regards, Roger
>> >
>> >
>> > On 02/08/2019 09:13 AM, Langer, Christoph wrote:
>> > Hi Goetz,
>> >
>> > ok, so here a new webrev just adding a setter for the field:
>> > http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/02
>> >
>> > ... and incorporating the other comments.
>> > Looks better ��
>> >
>> > I have a few additions to
>> > src/java.base/share/classes/java/lang/NullPointerException.java:
>> >
>> > 28 import java.lang.reflect.Field;
>> > 29
>> > -> can be removed now.
>> >
>> > 91         synchronized (this) {
>> > -> I think this is not needed here. The transient modifier for
>> > lazyComputeMessage already implies the lock on the Object, I think.
>> >
>> > 108         return extendedMessage;
>> > -> I guess it would be more correct if you returned super.getMessage() here.
>> >
>> > Thanks
>> > Christoph
>> >