[riscv-port] RFR: 8277391: riscv: implement tls function MacroAssembler::get_thread() by using tls.ie

zhengxiaolinX duke at openjdk.java.net
Fri Nov 19 11:19:10 UTC 2021


On Fri, 19 Nov 2021 03:36:00 GMT, Fei Yang <fyang at openjdk.org> wrote:

>> Hi team,
>> 
>> This patch fixes the unimplemented assembly version of `__ get_thread()`, bringing some performance improvement by directly using the x4 register to refer to the variable `Thread::_thr_current`. Tested tests under hotspot/ and jdk/. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
>> 
>> Thanks,
>> Xiaolin
>
> src/hotspot/os_cpu/linux_riscv/threadLS_linux_riscv.S line 31:
> 
>> 29:     .type   _ZN10JavaThread25riscv64_get_thread_helperEv, %function
>> 30: 
>> 31: _ZN10JavaThread25riscv64_get_thread_helperEv:
> 
> For maximum flexibility, I would suggest we switch to use the global-dynamic TLS model here if you don't have a strong reason to use the initial-exec TLS model. That's the default TLS model used by the host GCC compiler and I don't want to affected by the limits of the initial-exec TLS model some day.
> ---------------
>         la.tls.gd       a0, _ZN6Thread12_thr_currentE
>         addi    sp, sp, -16
>         sd      ra, 8(sp)
>         call    __tls_get_addr at plt
>         ld      ra, 8(sp)
>         ld      a0, 0(a0)
>         addi    sp, sp, 16
>         ret
> ---------------

Hi Felix -

Quite sorry for the late reply - it takes me some time to look up some references. This code looks great without any problem and I totally agree with your point - so it is hard to make me not think you have implemented one version already in your codebase :-)  And `__tls_get_addr`'s implementation could found itself at [this link](https://sources.debian.org/src/glibc/2.31-13/sysdeps/riscv/libc-tls.c/) with [macros defined](https://sources.debian.org/src/glibc/2.31-13/sysdeps/riscv/nptl/tls.h/), which is a C function. With this respect, it seems that I also need to save and restore all volatile registers pessimistically in `MacroAssembler::get_thread`, same as the current implementation on the master branch. I think it might be hard to keep the saved registers the same as AArch64's implementation that only clobbers x1 - please correct me if I missed something. But this seems much cheaper than calling a [pthread_getspecific](https://sources.debian.org/src/glibc/2.32-4/nptl/pthread_gets
 pecific.c/?hl=24#L24). In fact, the first version of this patch is written in a similar way as yours - for better performance I chose to use tls.ie - but I think you hold a very reasonable point.

It's Friday night so please feel free to move the discussion to weekdays.

Thanks -
Xiaolin

-------------

PR: https://git.openjdk.java.net/riscv-port/pull/7


More information about the riscv-port-dev mailing list