RFR(S): 8153892: Handle unsafe access error directly in signal handler instead of going through a stub
Mikael Vidstedt
mikael.vidstedt at oracle.com
Tue Apr 26 18:35:58 UTC 2016
On 4/12/2016 2:15 AM, Thomas Stüfe wrote:
> Hi Mikael, David,
>
> On Tue, Apr 12, 2016 at 2:29 AM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
> On 11/04/2016 10:57 AM, David Holmes wrote:
>
> Hi Mikael,
>
> I think we need to be able to answer the question as to why
> the stubbed
> and stubless forms of this code exist to ensure that
> converting all
> platforms to the same form is appropriate.
>
>
> The more I look at this the more the stubs make no sense :) AIII a
> stub is generated when we need runtime code that may be different
> to that which we could write directly for compiling at build time
> - ie to use CPU specific features of the actual CPU. But I see
> nothing here that suggests any such usage.
>
> So I agree with removing the stubs.
>
> I'm still going through this but my initial reaction is to
> wonder why we
> don't use the same form of handle_unsafe_access on all
> platforms and
> always pass in npc? (That seems to be the only difference in
> code that
> otherwise seems platform independent.)
>
>
> Futher to this and Thomas's comments I think
> handle_unsafe_access(thread, pc, npc) can be defined in shared
> code (where? not sure). Further, if we always pass in npc then we
> don't need to pass in pc as it is unused (seems unused in original
> code too for sparc).
>
>
> I agree. We commonized ucontext_set_pc for all Posix platforms, so we
> can make a common function "handle_unsafe_access(thread, npc)" and
> inside use os::Posix::ucontext_set_pc to modify the context. Then we
> can get rid of the special handling in the signal handlers inside
> os_aix_ppc.cpp and os_linux_ppc.cpp (for both the compiled and the
> interpreted case).
There is definitely room for unification and simplification here. Right
now the signal handling code is, sadly, different on all the different
platforms, despite the fact that in many cases it should be similar or
the exact same. That said, as much as a refactoring/rewrite of the
signal handler code is needed, it will very quickly turn into a much
larger effort...
In this specific case, it would probably make more sense to pass in the
full context to the handle_unsafe_access method and have it do whatever
it feels is necessary to update it. However, a lot of the signal handler
code assumes that a "stub" variable gets set up and only at the end of
the main signal handler function does the actual context get updated.
Changing how that works only for this specific case is obviously not a
good idea, which means it's back to the full scale refactoring and out
of scope for the bug fix.
So to me the fact that the method prototypes differ depending on the
exact platform is just a reflection of how the contexts differ. In lack
of the full context the handler method needs to take whatever parts of
the context is needed to do it's job. I could of course change the
handler method to only take a single "next_pc" argument, but to me that
feels like putting a key part of the logic which handles the unsafe
access (specifically, the part which calculates the next pc) in the
wrong place - IMHO that should really be tightly coupled with the rest
of the logic needed to handle an unsafe access (updating the thread
state etc.), and hence I feel that it really belongs in the
handle_unsafe_access method itself. Happy to hear your thoughts, but I
hope we can agree that the suggested fix, even in its current state, is
still significantly better than what is there now.
Unless somebody has a better suggestion, I'm going to be moving the
implementations of the handle_unsafe_access methods to sharedRuntime
(instead of stubRoutines) and will send out a new webrev shortly.
Cheers,
Mikael
>
>
> BTW I found this comment somewhat unfathomable (both now and in
> original code):
>
> + // pc is the instruction which we must emulate
> + // doing a no-op is fine: return garbage from the load
>
> but finally realized that it means that after the load that raised
> the signal the native code proceeds normally but the value
> apparently loaded is just garbage/arbitrary, and the only sign
> something went wrong is the setting of the pending
> unsafe-access-error bit. This would be a potential source of bugs
> I think, except that when we hit the Java level, we throw the
> exception and so never actually "return" the garbage value. But it
> does mean we would have to be careful if calling the unsafe
> routines from native code.
>
>
> I admit I do not understand fully how the
> _special_runtime_exit_condition flag is processed later, at least not
> for all cases: If I have a java method A using sun.misc.unsafe, which
> gets compiled, the sun.misc.unsafe intrinsic gets inlined into that
> method. So, the whole method A gets marked as "has unsafe access"? So,
> any SIGBUS happening inside this method - which may be larger than the
> inlined sun.misc.unsafe call - will yield an InternalError? And when
> is the flag checked if that method A is called from another java method B?
>
> Sorry if the questions are stupid, I am not a JIT expert, but I try to
> understand how much can happen between the SIGBUS and the
> InternalError getting thrown.
No questions are stupid here. As you may have seen in the other thread,
I filed JDK-8154592[1] to cover making the handling of the faults
synchronous. Hope that helps.
Cheers,
Mikael
[1] https://bugs.openjdk.java.net/browse/JDK-8154592
>
> Thanks, Thomas
>
> Thanks,
> David
>
>
> Thanks,
> David
>
> On 9/04/2016 8:33 AM, Mikael Vidstedt wrote:
>
>
> Please review:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8153892
> Webrev:
> http://cr.openjdk.java.net/~mikael/webrevs/8153892/webrev.01/hotspot/webrev/
> <http://cr.openjdk.java.net/%7Emikael/webrevs/8153892/webrev.01/hotspot/webrev/>
>
>
>
> * Note: this is patch 2 in a set of 3 all aiming to clean
> up and unify
> the unsafe memory getters/setters, along with the handling
> of unsafe
> access errors. The other two issues are:
>
> https://bugs.openjdk.java.net/browse/JDK-8153890 - Handle
> unsafe access
> error as an asynchronous exception
> https://bugs.openjdk.java.net/browse/JDK-8150921 - Update
> Unsafe
> getters/setters to use double-register variants
>
>
> * Summary (copied from the bug description)
>
>
> In certain cases, such as accessing a region of a memory
> mapped file
> which has been truncated on unix-style operating systems,
> a SIGBUS
> signal will be raised and the VM will process it in the
> signal handler.
>
> How the signal is processed differs depending on the
> operating system
> and/or CPU architecture, with two major alternatives:
>
> * "stubless"
>
> Do the necessary thread state updates directly in the
> signal handler,
> and modify the context so that the signal handler returns
> to the place
> where the execution should continue
>
> * Using a stub
>
> Update the context so that when the signal handler returns
> the thread
> will continue execution in a generated stub, which in turn
> will call
> some native code in the VM to update the thread state and
> figure out
> where execution should continue. The stub will then jump
> to that new
> place.
>
>
> It should be noted that the work of updating the thread
> state is very
> small - it's setting a flag or two in the thread
> structure, and figures
> out where the next instruction starts. It should also be
> noted that the
> generated stubs today are broken, because they do not
> preserve all the
> live registers over the call into the VM. There are two
> ways to address
> this:
>
> * Preserve all the necessary registers
>
> This would mean implementing, in macro assembly, the
> necessary logic for
> preserving all the live registers, including, but not
> limited to,
> floating point registers, flag registers, etc. It quickly
> becomes
> obvious that this platform specific and error prone.
>
> * Leverage the fact that the operating system already does
> this as part
> of the signal handling
>
> Do the necessary work in the signal handler instead,
> removing the need
> for the stub alltogether
>
> As mentioned, on some platforms the latter model is
> already in use. It
> is dramatically easier and all platforms should be updated
> to do it the
> same way.
>
>
> * Testing
>
> Just as mentioned in the RFR for JDK-8153890, a new test
> was developed
> to test this code path:
>
> http://cr.openjdk.java.net/~mikael/webrevs/8150921/MappedTruncated.java
> <http://cr.openjdk.java.net/%7Emikael/webrevs/8150921/MappedTruncated.java>
>
> In fact, it was when running this test I found the
> register preservation
> issue. JPRT also passes. Much like JDK-8153890 I wanted to
> get some
> feedback on this before running additional tests.
>
>
> Cheers,
> Mikael
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/attachments/20160426/9dd6c8c9/attachment-0001.html>
More information about the ppc-aix-port-dev
mailing list