RFR(S) 8209950: SIGBUS in CodeHeapState::print_names()

Thu Aug 30 06:35:23 UTC 2018

Thank you, Vladimir.
I'll go ahead and push the change now. 
Regards,
Lutz

On 30.08.18, 01:11, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:

    Perfect.

    Thanks,
    Vladimir

    On 8/29/18 7:40 AM, Schmidt, Lutz wrote:
    > Hi Vladimir,
    > 
    > how about this one? http://cr.openjdk.java.net/~lucy/webrevs/8209950.01/
    > I have factored out the complex if as suggested.
    > 
    > @Tobias: can you live with this version as well?
    > 
    > Thanks, Lutz
    > 
    > On 28.08.18, 23:29, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    > 
    >      On 8/28/18 1:58 PM, Schmidt, Lutz wrote:
    >      > Hi Vladimir,
    >      >
    >      > I'm not sure if I understand your proposal correctly. So I try to rephrase it according to my understanding. You can then please let me know if I got it right. Here we go:
    >      >
    >      > You suggest to factor out the contents of the complex if statement into a separate function (like static bool nmethod_access_is_safe() or similar). The function could live in class CodeCache, class CompiledMethod, or somewhere similar. What is your favorite place?
    >      >
    >      > If this is your request, I am absolutely in favor of it. If not: please correct me.
    >      
    >      Yes, it is exactly my request. I would put it into CompiledMethod.
    >      
    >      >
    >      > Re performance:
    >      >
    >      > Yes, both code locations are used for tracing/debugging/analysis purposes only. The do not impact production workloads. But: I have seen "safepoint timeouts" in our tests with -XX:+CountCompiledCalls. Just imagine how many (compiled) methods exist after a long run of a large workload with a big code heap. You certainly don't want your trace data to be truncated after investing so much patience and machine time. What I want to say: It's worthwhile considering performance even for debugging code. Not to stringently, but spend a thought.
    >      >
    >      > CodeHeap state analytics can even be requested in production systems. That's what we at SAP originally designed them for, strongly influenced by our support engineers.
    >      Got it.
    >      
    >      Thanks,
    >      Vladimir
    >      
    >      >
    >      > OK, I come to an end now. Sorry for the long text.
    >      >
    >      > Thanks,
    >      > Lutz
    >      >
    >      > On 28.08.18, 21:51, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >      >
    >      >      Hi Lutz,
    >      >
    >      >      Thank you for looking on this. Changes are fine but I think you introduce a new function to fold
    >      >      these set of checks since they become so similar.  I am not shere your concern about
    >      >      os::is_readable_pointer() expense. Both places you fixed are not performance critical.
    >      >
    >      >      Thanks,
    >      >      Vladimir
    >      >
    >      >      On 8/27/18 5:30 AM, Schmidt, Lutz wrote:
    >      >      > Dear all,
    >      >      >
    >      >      > may I please request reviews for my change? It hardens code which iterates over all nmethods in the code cache, e.g. by CodeCache::nmethods_do(). The bug refers to an issue in share/code/codeHeapState.cpp, but the same code exists in share/runtime/sharedRuntime.cpp. SIGBUS/SIGSEGV errors have been observed recently for both locations. Holding the CodeCache_lock while iterating, as asserted by CodeCache::nmethods_do(), does not help.
    >      >      >
    >      >      > The solution is to check yet another pointer for NULL and accessibility before actually using it. Not-NULL but invalid pointers have been observed. That makes accessibility checks necessary. The calls to os::is_readable_pointer() are potentially expensive (if they fail), but the alternative would be a SIGBUS/SIGSEGV.
    >      >      >
    >      >      > This fix extends JDK-8209588 for the sharedRuntime.cpp part. As noted there, it may have impact as far back as jdk9 (for sharedRuntime.cpp).
    >      >      >
    >      >      > Bug:    https://bugs.openjdk.java.net/browse/JDK-8209950
    >      >      > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8209950.00/
    >      >      >
    >      >      > Thank you,
    >      >      > Lutz
    >      >      >
    >      >      >
    >      >
    >      >
    >      
    >