RFR: 8287325: AArch64: fix virtual threads with -XX:UseBranchProtection=pac-ret [v4]

Hao Sun haosun at openjdk.org
Mon Aug 21 03:45:55 UTC 2023


> ### Background
> 
> 1. PAC-RET branch protection was initially implemented on Linux/AArch64 in JDK-8277204 [1].
> 
> 2. However, it was broken with the introduction of virtual threads [2], mainly because the continuation freeze/thaw mechanism would trigger stack copying to/from memory, whereas the saved and signed LR on the stack doesn't get re-signed accordingly.
> 
> 3. PR-9067 [3] tried to implement the re-sign part, but it was not accepted because PAC-RET always turns on the "PreserveFramePointer" option, which would slow down virtual threads by ~5-20x.
> 
> 4. As a workaround, JDK-8288023 [4] disables PAC-RET when preview language features are enabled. Note that virtual threads were a preview feature at that time.
> 
> 5. Virtual threads will become a permanent feature in JDK 21 [5][6].
> 
> ### Goal
> 
> This patch aims to make PAC-RET compatible with virtual threads.
> 
> ### Requirements of virtual threads
> 
> R-1: Option "PreserveFramePointer" should be turned off. That is, PAC-RET implementation should not rely on frame pointer FP. Otherwise, the fast path in stack copying will never be taken.
> 
> R-2: Use a value that is invariant under stack copying as the modifier, so that no PAC re-sign is needed on continuation thaw; the fast path in stack copying doesn't walk the frames.
> 
> Note that more details can be found in the discussion [3].
> 
> ### Investigation
> 
> We considered (relative) stack pointer SP, thread ID, PACStack [7] and value zero as candidate modifiers.
> 
> 1. SP: In some scenarios, we need to authenticate the return address in places where the current SP doesn't match the SP on function entry. E.g. see the usage in Runtime1::generate_handle_exception(). Hence, neither absolute nor relative SP works.
> 
> 2. thread ID (tid): It's invariant for a virtual thread, but it's nontrivial to access from the JIT side. We would need to 1) first resolve the address of the current thread (see [8] for an example), and 2) read the tid field in a way similar to java_lang_Thread::thread_id(). I suppose this would introduce significant performance overhead. Could we then use the "rthread" register (JavaThread object address) as the modifier? Unfortunately, it is not invariant for virtual threads, so a PAC re-sign would still be needed.
> 
> 3. PACStack uses the signed return address of the caller as the modifier to sign the callee's return address. In this way, we get one PACed call chain. The modifier should be saved somewhere around the frame record. Inevitably, FP should be preserved to make it easy to find this modifier in case of...

Hao Sun has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:

 - Merge branch 'master' into jdk-8287325
 - Remove my temp test patch on jvmci_global.hpp and stubGenerator_aarch64.hpp
 - Use relative SP as the PAC modifier
 - Merge branch 'master' into jdk-8287325
 - Merge branch 'master' into jdk-8287325
 - Rename return_pc_at and patch_pc_at
   
   Rename return_pc_at to return_address_at.
   Rename patch_pc_at to patch_return_address_at.
 - 8287325: AArch64: fix virtual threads with -XX:UseBranchProtection=pac-ret
   
   * Background
   
   1. PAC-RET branch protection was initially implemented on Linux/AArch64
   in JDK-8277204 [1].
   
   2. However, it was broken with the introduction of virtual threads [2],
   mainly because the continuation freeze/thaw mechanism would trigger
   stack copying to/from memory, whereas the saved and signed LR on the
   stack doesn't get re-signed accordingly.
   
   3. PR-9067 [3] tried to implement the re-sign part, but it was not
   accepted because PAC-RET always turns on the "PreserveFramePointer"
   option, which would slow down virtual threads by ~5-20x.
   
   4. As a workaround, JDK-8288023 [4] disables PAC-RET when preview
   language features are enabled. Note that virtual threads were a
   preview feature at that time.
   
   5. Virtual threads will become a permanent feature in JDK 21 [5][6].
   
   * Goal
   
   This patch aims to make PAC-RET compatible with virtual threads.
   
   * Requirements of virtual threads
   
   R-1: Option "PreserveFramePointer" should be turned off. That is,
   PAC-RET implementation should not rely on frame pointer FP. Otherwise,
   the fast path in stack copying will never be taken.
   
   R-2: Use a value that is invariant under stack copying as the
   modifier, so that no PAC re-sign is needed on continuation thaw; the
   fast path in stack copying doesn't walk the frames.
   
   Note that more details can be found in the discussion [3].
   
   * Investigation
   
   We considered (relative) stack pointer SP, thread ID, PACStack [7] and
   value zero as candidate modifiers.
   
   1. SP: In some scenarios, we need to authenticate the return address in
   places where the current SP doesn't match the SP on function entry. E.g.
   see the usage in Runtime1::generate_handle_exception(). Hence, neither
   absolute nor relative SP works.
   
   2. thread ID (tid): It's invariant for a virtual thread, but it's
   nontrivial to access from the JIT side. We would need to 1) first
   resolve the address of the current thread (see [8] for an example),
   and 2) read the tid field in a way similar to
   java_lang_Thread::thread_id(). I suppose this would introduce
   significant performance overhead.
   
   Could we then use the "rthread" register (JavaThread object address)
   as the modifier? Unfortunately, it is not invariant for virtual
   threads, so a PAC re-sign would still be needed.
   
   3. PACStack uses the signed return address of the caller as the
   modifier to sign the callee's return address. In this way, we get one
   PACed call chain. The modifier should be saved somewhere around the
   frame record. Inevitably, FP should be preserved to make it easy to
   find this modifier in some exception scenarios (recall the reason why
   we fail to use SP as the modifier).
   
   Finally, we chose value zero as the modifier. Trivially, it's
   compatible with virtual threads. However, compared to the FP modifier,
   this solution reduces the strength of PAC-RET protection to some
   extent. E.g., you get the same authentication code for each call to
   the function, whereas using FP gives you different codes as long as
   the stack depth is different.
   
   * Implementation of Zero modifier
   
   The key updates of this patch are listed below.
   
   1. vm_version_aarch64.cpp
   
   Remove the constraint on "enable-preview" and "PreserveFramePointer".
   
   2. macroAssembler_aarch64.cpp
   
   For utility protect_return_address(): 1) PACIAZ/PACIZA instructions
   are used directly; 2) argument "temp_reg" is removed since all
   functions use the same modifier; 3) all the use sites are updated
   accordingly, which involves updates in many files.
   
   Similar updates are done to utility authenticate_return_address().
   
   Besides, aarch64.ad and AArch64TestAssembler.java are updated
   accordingly.
   
   3. pauth_linux_aarch64.inline.hpp
   
   For utilities pauth_sign_return_address() and
   pauth_authenticate_return_address(), remove the second argument and
   pass value zero in the r16 register.
   
   Similarly, all the use sites are updated as well, which involves
   updates in many files.
   
   4. continuationHelper_aarch64.inline.hpp
   
   Introduce return_pc_at() and patch_pc_at() to avoid directly
   reading the saved PC or writing new signed PC on the stack in shared
   code.
   
   5. Minor updates
   
   1) sharedRuntime_aarch64.cpp: Add the missing
   authenticate_return_address() use for function gen_continuation_enter().
   In functions generate_deopt_blob() and generate_uncommon_trap_blob(),
   remove the authentication on the caller (3) frame since the return
   address is not used.
   
   2) stubGenerator_aarch64.cpp: Add the missing
   authenticate_return_address() use for function generate_cont_thaw().
   
   3) runtime.cpp: enable the authentication.
   
   * Test
   
   1. Cross compilations on arm32/s390/ppc/riscv passed.
   2. zero build and x86 build passed.
   3. tier1~3 passed on Linux/AArch64 w/ and w/o PAC-RET.
   
   Co-Developed-by: Nick Gasson <Nick.Gasson at arm.com>
   
   [1] https://bugs.openjdk.org/browse/JDK-8277204
   [2] https://openjdk.org/jeps/425
   [3] https://github.com/openjdk/jdk/pull/9067
   [4] https://bugs.openjdk.org/browse/JDK-8288023
   [5] https://bugs.openjdk.org/browse/JDK-8301819
   [6] https://openjdk.org/jeps/444
   [7] https://www.usenix.org/conference/usenixsecurity21/presentation/liljestrand
   [8] https://github.com/openjdk/jdk/pull/10441
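
As a rough illustration of requirement R-2 and of the modifier trade-off discussed in the Investigation section above, here is a toy model. It is not HotSpot code and it does not implement the real PAC algorithm (real codes come from a keyed function in hardware). All names and constants (toy_sign, toy_authenticate, survives_verbatim_copy, kToyKey, ...) are hypothetical. The sketch only shows that a code signed with a position-dependent modifier stops authenticating after a verbatim copy to a new location, while a code signed with a constant modifier still authenticates.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Toy model only: real PAC codes come from a keyed hardware function.
// All names and constants below are hypothetical.
constexpr uint64_t kAddrMask = 0x0000FFFFFFFFFFFFULL;  // low 48 bits: address
constexpr uint64_t kToyKey   = 0x00005A5AA5A55A5AULL;  // stand-in for the PAC key

// XOR-fold a 64-bit value into 16 bits.
inline uint64_t fold16(uint64_t v) {
  return (v ^ (v >> 16) ^ (v >> 32) ^ (v >> 48)) & 0xFFFF;
}

// 16-bit toy "authentication code" over (address, modifier).
inline uint64_t toy_tag(uint64_t addr, uint64_t modifier) {
  return fold16((addr & kAddrMask) ^ kToyKey) ^ fold16(modifier);
}

// Sign: place the tag in the unused upper 16 bits of the address.
inline uint64_t toy_sign(uint64_t addr, uint64_t modifier) {
  assert((addr & ~kAddrMask) == 0 && "address must fit in 48 bits");
  return addr | (toy_tag(addr, modifier) << 48);
}

// Authenticate: recompute the tag; on success write the stripped address.
inline bool toy_authenticate(uint64_t signed_addr, uint64_t modifier,
                             uint64_t* stripped) {
  const uint64_t addr = signed_addr & kAddrMask;
  if ((signed_addr >> 48) != toy_tag(addr, modifier)) return false;
  *stripped = addr;
  return true;
}

// Sign with one modifier, copy the slot verbatim (no re-sign, like a
// memcpy-style stack copy), then authenticate with a possibly different
// modifier. Returns true if the original address is recovered.
inline bool survives_verbatim_copy(uint64_t ret_addr,
                                   uint64_t modifier_at_sign,
                                   uint64_t modifier_at_auth) {
  const uint64_t stack_slot = toy_sign(ret_addr, modifier_at_sign);
  uint64_t copied_slot = 0;
  std::memcpy(&copied_slot, &stack_slot, sizeof stack_slot);
  uint64_t stripped = 0;
  return toy_authenticate(copied_slot, modifier_at_auth, &stripped) &&
         stripped == ret_addr;
}
```

In this model, survives_verbatim_copy(ret, 0, 0) holds for any 48-bit ret, whereas signing with 0x1000 and authenticating with 0x2000 (standing in for an FP or SP value that changed because the frame moved) fails. The price is the one named in the commit message: with modifier zero, the same return address always gets the same code.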

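Item 4 of the Implementation section above (accessors so that shared code never reads or writes a raw signed PC on the stack) can be sketched in the same toy style. The two accessor names follow the renamed commit in the list above, but their signatures here, the toy namespace, and the helpers sign_zero() and strip() are assumptions made for illustration, not the actual HotSpot declarations.

```cpp
#include <cassert>
#include <cstdint>

namespace toy {

// Hypothetical constants; real PAC uses a hardware key and function.
constexpr uint64_t kAddrMask = 0x0000FFFFFFFFFFFFULL;  // low 48 bits: address
constexpr uint64_t kKey      = 0x5A5AULL;              // stand-in for the PAC key

// Toy "sign with modifier zero": a 16-bit XOR-fold of (pc ^ key) is
// stored in the upper 16 bits of the value.
inline uint64_t sign_zero(uint64_t pc) {
  assert((pc & ~kAddrMask) == 0 && "pc must fit in 48 bits");
  const uint64_t v = pc ^ kKey;
  const uint64_t tag = (v ^ (v >> 16) ^ (v >> 32) ^ (v >> 48)) & 0xFFFF;
  return pc | (tag << 48);
}

// Drop the code bits and return the plain address.
inline uint64_t strip(uint64_t signed_pc) { return signed_pc & kAddrMask; }

// Read the return address from a stack slot without exposing the
// signed value to the caller.
inline uint64_t return_address_at(const uint64_t* slot) {
  return strip(*slot);
}

// Store a new return address into a stack slot, signing it first so
// that a later authentication of the slot succeeds.
inline void patch_return_address_at(uint64_t* slot, uint64_t pc) {
  *slot = sign_zero(pc);
}

}  // namespace toy
```

Shared code written against such accessors stays the same on platforms without PAC, where both helpers would reduce to a plain load and store.
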
-------------

Changes: https://git.openjdk.org/jdk/pull/13322/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13322&range=03
  Stats: 309 lines in 30 files changed: 181 ins; 18 del; 110 mod
  Patch: https://git.openjdk.org/jdk/pull/13322.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13322/head:pull/13322

PR: https://git.openjdk.org/jdk/pull/13322

