[aarch64-port-dev ] aarch64_get_thread_helper assumptions
Bernhard Urban-Forster
beurba at microsoft.com
Mon Jan 25 19:38:57 UTC 2021
Good question :-)
As a reference, the generated code for aarch64_get_thread_helper on Windows and then on macOS:
COMDAT; sym= "public: static class Thread * __cdecl JavaThread::aarch64_get_thread_helper(void)" (?aarch64_get_thread_helper@JavaThread@@SAPEAVThread@@XZ)
8 byte align
Execute Read
?aarch64_get_thread_helper@JavaThread@@SAPEAVThread@@XZ (public: static class Thread * __cdecl JavaThread::aarch64_get_thread_helper(void)):
0000000000000000: F81F0FF3 str x19,[sp,#-0x10]!
0000000000000004: A9BF7BFD stp fp,lr,[sp,#-0x10]!
0000000000000008: 910003FD mov fp,sp
000000000000000C: 90000008 adrp x8,_tls_index
0000000000000010: B9400109 ldr w9,[x8,_tls_index]
0000000000000014: F9402E48 ldr x8,[xpr,#0x58]
0000000000000018: F8695913 ldr x19,[x8,w9 uxtw #3]
000000000000001C: 91400269 add x9,x19,__tls_guard,lsl #0xC
0000000000000020: 39400128 ldrb w8,[x9,__tls_guard]
0000000000000024: 35000048 cbnz w8,$LN7
0000000000000028: 94000000 bl __dyn_tls_on_demand_init
$LN7:
000000000000002C: 91400268 add x8,x19,?_thr_current@Thread@@0PEAV1@EA,lsl #0xC
0000000000000030: F9400100 ldr x0,[x8,?_thr_current@Thread@@0PEAV1@EA]
0000000000000034: A8C17BFD ldp fp,lr,[sp],#0x10
0000000000000038: F84107F3 ldr x19,[sp],#0x10
000000000000003C: D65F03C0 ret
0000000000026e0c __ZN10JavaThread25aarch64_get_thread_helperEv:
26e0c: fd 7b bf a9 stp x29, x30, [sp, #-16]!
26e10: fd 03 00 91 mov x29, sp
26e14: 00 00 00 90 adrp x0, #0
26e18: 00 00 40 f9 ldr x0, [x0]
26e1c: 08 00 40 f9 ldr x8, [x0]
26e20: 00 01 3f d6 blr x8
26e24: 00 00 40 f9 ldr x0, [x0]
26e28: fd 7b c1 a8 ldp x29, x30, [sp], #16
26e2c: c0 03 5f d6 ret
So what you are suggesting sounds reasonable. Since we don't have that in place on Windows today, I'm surprised we don't have more issues. On hotspot:tier1 I get the same result with your suggested patch as here [1], so please go ahead and integrate it like you suggested.
Thanks for catching this,
-Bernhard
[1] https://github.com/openjdk/aarch64-port/pull/12#issuecomment-764779603
________________________________________
From: Anton Kozlov <akozlov at azul.com>
Sent: Sunday, January 24, 2021 17:42
To: Ludovic Henry; Bernhard Urban-Forster
Cc: aph at redhat.com; aarch64-port-dev at openjdk.java.net
Subject: aarch64_get_thread_helper assumptions
Hi, Bernhard, Ludovic,
following Andrew's comment on https://github.com/openjdk/jdk/pull/2200#discussion_r563131940: could you describe why the volatile registers are not saved on Windows? The helper is an ordinary C function that can clobber r0-r17, so shouldn't these be saved?
I would like to make a change like the one below, or am I missing something?
Thanks,
Anton
--- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
@@ -5265,10 +5265,14 @@ void MacroAssembler::char_array_compress(Register src, Register dst, Register le
// by the call to JavaThread::aarch64_get_thread_helper() or, indeed,
// the call setup code.
//
-// aarch64_get_thread_helper() clobbers only r0, r1, and flags.
+// On Linux, aarch64_get_thread_helper() clobbers only r0, r1, and flags.
+// On Windows and macOS, the helper is a usual C function.
//
void MacroAssembler::get_thread(Register dst) {
- RegSet saved_regs = RegSet::range(r0, r1) + BSD_ONLY(RegSet::range(r2, r17)) + lr - dst;
+ RegSet saved_regs =
+ LINUX_ONLY(RegSet::range(r0, r1) + lr - dst)
+ NOT_LINUX (RegSet::range(r0, r17) + lr - dst);
+
push(saved_regs, sp);
mov(lr, CAST_FROM_FN_PTR(address, JavaThread::aarch64_get_thread_helper));
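
To make the register-set arithmetic in the patch above easier to check, here is a minimal standalone sketch. It does not use HotSpot's RegSet; RegSetModel, kLr and saved_regs are hypothetical names invented for this illustration, and the only things taken from the patch are the ranges r0-r1 (Linux) and r0-r17 (other platforms), plus lr, minus dst.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for HotSpot's RegSet: one bit per general-purpose
// register (bit n set means register rn is in the set).
struct RegSetModel {
    uint32_t bits;

    static RegSetModel of(int reg) {
        assert(reg >= 0 && reg < 32);
        return RegSetModel{uint32_t{1} << reg};
    }

    // Inclusive range [first, last], mirroring how the patch uses range(r0, r17).
    static RegSetModel range(int first, int last) {
        assert(first >= 0 && first <= last && last < 32);
        uint32_t b = 0;
        for (int r = first; r <= last; ++r) b |= uint32_t{1} << r;
        return RegSetModel{b};
    }

    // Set union and set difference, so "range + lr - dst" reads as in the patch.
    RegSetModel operator+(RegSetModel o) const { return RegSetModel{bits | o.bits}; }
    RegSetModel operator-(RegSetModel o) const { return RegSetModel{bits & ~o.bits}; }

    bool contains(int reg) const { return (bits >> reg) & 1u; }

    int count() const {
        int n = 0;
        for (uint32_t b = bits; b != 0; b >>= 1) n += static_cast<int>(b & 1u);
        return n;
    }
};

// Assumption for this sketch: lr is modeled as register number 30 (x30).
constexpr int kLr = 30;

// Models the saved_regs expression from the patch:
//   Linux:  range(r0, r1)  + lr - dst
//   others: range(r0, r17) + lr - dst
RegSetModel saved_regs(bool is_linux, int dst) {
    RegSetModel base = is_linux ? RegSetModel::range(0, 1)
                                : RegSetModel::range(0, 17);
    return base + RegSetModel::of(kLr) - RegSetModel::of(dst);
}
```

With dst = r0, the Linux variant saves only r1 and lr, while the non-Linux variant saves r1-r17 and lr. The destination register is always excluded, because it receives the result and must not be restored over it.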