[riscv-port] RFR: 8277391: riscv: Implement tls function MacroAssembler::get_thread() using tls.ie
Hi team, This patch fixes the unimplemented assembly version of `__ get_thread()`, bringing some performance improvement by directly using the x4 register to refer to the variable `Thread::_thr_current`. Tested tests under hotspot/ and jdk/. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4) Thanks, Xiaolin ------------- Commit messages: - Implement tls function MacroAssembler::get_thread() using tls.ie Changes: https://git.openjdk.java.net/riscv-port/pull/7/files Webrev: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=7&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8277391 Stats: 60 lines in 3 files changed: 44 ins; 12 del; 4 mod Patch: https://git.openjdk.java.net/riscv-port/pull/7.diff Fetch: git fetch https://git.openjdk.java.net/riscv-port pull/7/head:pull/7 PR: https://git.openjdk.java.net/riscv-port/pull/7
On Thu, 18 Nov 2021 12:36:31 GMT, zhengxiaolinX <duke@openjdk.java.net> wrote:
Hi team,
This patch fixes the unimplemented assembly version of `__ get_thread()`, bringing some performance improvement by directly using the x4 register to refer to the variable `Thread::_thr_current`. Tested tests under hotspot/ and jdk/. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
Thanks, Xiaolin
Changes requested by fyang (Lead). src/hotspot/os_cpu/linux_riscv/threadLS_linux_riscv.S line 31:
29: .type _ZN10JavaThread25riscv64_get_thread_helperEv, %function 30: 31: _ZN10JavaThread25riscv64_get_thread_helperEv:
For maximum flexibility, I would suggest we switch to use the global-dynamic TLS model here if you don't have a strong reason to use the initial-exec TLS model. That's the default TLS model used by the host GCC compiler and I don't want to affected by the limits of the initial-exec TLS model some day. --------------- la.tls.gd a0, _ZN6Thread12_thr_currentE addi sp, sp, -16 sd ra, 8(sp) call __tls_get_addr@plt ld ra, 8(sp) ld a0, 0(a0) addi sp, sp, 16 ret --------------- ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
On Fri, 19 Nov 2021 03:36:00 GMT, Fei Yang <fyang@openjdk.org> wrote:
Hi team,
This patch fixes the unimplemented assembly version of `__ get_thread()`, bringing some performance improvement by directly using the x4 register to refer to the variable `Thread::_thr_current`. Tested tests under hotspot/ and jdk/. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
Thanks, Xiaolin
src/hotspot/os_cpu/linux_riscv/threadLS_linux_riscv.S line 31:
29: .type _ZN10JavaThread25riscv64_get_thread_helperEv, %function 30: 31: _ZN10JavaThread25riscv64_get_thread_helperEv:
For maximum flexibility, I would suggest we switch to use the global-dynamic TLS model here if you don't have a strong reason to use the initial-exec TLS model. That's the default TLS model used by the host GCC compiler and I don't want to affected by the limits of the initial-exec TLS model some day. --------------- la.tls.gd a0, _ZN6Thread12_thr_currentE addi sp, sp, -16 sd ra, 8(sp) call __tls_get_addr@plt ld ra, 8(sp) ld a0, 0(a0) addi sp, sp, 16 ret ---------------
Hi Felix - Quite sorry for the late reply - it takes me some time to look up some references. This code looks great without any problem and I totally agree with your point - so it is hard to make me not think you have implemented one version already in your codebase :-) And `__tls_get_addr`'s implementation could found itself at [this link](https://sources.debian.org/src/glibc/2.31-13/sysdeps/riscv/libc-tls.c/) with [macros defined](https://sources.debian.org/src/glibc/2.31-13/sysdeps/riscv/nptl/tls.h/), which is a C function. With this respect, it seems that I also need to save and restore all volatile registers pessimistically in `MacroAssembler::get_thread`, same as the current implementation on the master branch. I think it might be hard to keep the saved registers the same as AArch64's implementation that only clobbers x1 - please correct me if I missed something. But this seems much cheaper than calling a [pthread_getspecific](https://sources.debian.org/src/glibc/2.32-4/nptl/pthread_gets pecific.c/?hl=24#L24). In fact, the first version of this patch is written in a similar way as yours - for better performance I chose to use tls.ie - but I think you hold a very reasonable point. It's Friday night so please feel free to move the discussion to weekdays. Thanks - Xiaolin ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
On Fri, 19 Nov 2021 11:15:40 GMT, zhengxiaolinX <duke@openjdk.java.net> wrote:
src/hotspot/os_cpu/linux_riscv/threadLS_linux_riscv.S line 31:
29: .type _ZN10JavaThread25riscv64_get_thread_helperEv, %function 30: 31: _ZN10JavaThread25riscv64_get_thread_helperEv:
For maximum flexibility, I would suggest we switch to use the global-dynamic TLS model here if you don't have a strong reason to use the initial-exec TLS model. That's the default TLS model used by the host GCC compiler and I don't want to affected by the limits of the initial-exec TLS model some day. --------------- la.tls.gd a0, _ZN6Thread12_thr_currentE addi sp, sp, -16 sd ra, 8(sp) call __tls_get_addr@plt ld ra, 8(sp) ld a0, 0(a0) addi sp, sp, 16 ret ---------------
Hi Felix -
Quite sorry for the late reply - it takes me some time to look up some references. This code looks great without any problem and I totally agree with your point - so it is hard to make me not think you have implemented one version already in your codebase :-) And `__tls_get_addr`'s implementation could found itself at [this link](https://sources.debian.org/src/glibc/2.31-13/sysdeps/riscv/libc-tls.c/) with [macros defined](https://sources.debian.org/src/glibc/2.31-13/sysdeps/riscv/nptl/tls.h/), which is a C function. With this respect, it seems that I also need to save and restore all volatile registers pessimistically in `MacroAssembler::get_thread`, same as the current implementation on the master branch. I think it might be hard to keep the saved registers the same as AArch64's implementation that only clobbers x1 - please correct me if I missed something. But this seems much cheaper than calling a [pthread_getspecific](https://sources.debian.org/src/glibc/2.32-4/nptl/pthread_ge tspecific.c/?hl=24#L24). In fact, the first version of this patch is written in a similar way as yours - for better performance I chose to use tls.ie - but I think you hold a very reasonable point.
It's Friday night so please feel free to move the discussion to weekdays.
Thanks - Xiaolin
Seems `get_thread()` is only used in debug code on AArch64 and RISCV -- I have pushed revised code and now hotspot and jdk tier1 are under testing. I will ping this thread when tests are done. ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
On Mon, 22 Nov 2021 02:40:18 GMT, zhengxiaolinX <duke@openjdk.java.net> wrote:
Hi Felix -
Quite sorry for the late reply - it takes me some time to look up some references. This code looks great without any problem and I totally agree with your point - so it is hard to make me not think you have implemented one version already in your codebase :-) And `__tls_get_addr`'s implementation could found itself at [this link](https://sources.debian.org/src/glibc/2.31-13/sysdeps/riscv/libc-tls.c/) with [macros defined](https://sources.debian.org/src/glibc/2.31-13/sysdeps/riscv/nptl/tls.h/), which is a C function. With this respect, it seems that I also need to save and restore all volatile registers pessimistically in `MacroAssembler::get_thread`, same as the current implementation on the master branch. I think it might be hard to keep the saved registers the same as AArch64's implementation that only clobbers x1 - please correct me if I missed something. But this seems much cheaper than calling a [pthread_getspecific](https://sources.debian.org/src/glibc/2.32-4/nptl/pthread_g etspecific.c/?hl=24#L24). In fact, the first version of this patch is written in a similar way as yours - for better performance I chose to use tls.ie - but I think you hold a very reasonable point.
It's Friday night so please feel free to move the discussion to weekdays.
Thanks - Xiaolin
Seems `get_thread()` is only used in debug code on AArch64 and RISCV -- I have pushed revised code and now hotspot and jdk tier1 are under testing. I will ping this thread when tests are done.
Updated changes looks good. Let's see when you have the test result. ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
Hi team,
This patch fixes the unimplemented assembly version of `__ get_thread()`, bringing some performance improvement by directly using the x4 register to refer to the variable `Thread::_thr_current`. Tested tests under hotspot/ and jdk/. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
Thanks, Xiaolin
zhengxiaolinX has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: Implement tls function MacroAssembler::get_thread() using tls.gd ------------- Changes: https://git.openjdk.java.net/riscv-port/pull/7/files Webrev: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=7&range=01 Stats: 58 lines in 3 files changed: 45 ins; 9 del; 4 mod Patch: https://git.openjdk.java.net/riscv-port/pull/7.diff Fetch: git fetch https://git.openjdk.java.net/riscv-port pull/7/head:pull/7 PR: https://git.openjdk.java.net/riscv-port/pull/7
On Mon, 22 Nov 2021 02:46:03 GMT, zhengxiaolinX <duke@openjdk.java.net> wrote:
Hi team,
This patch fixes the unimplemented assembly version of `__ get_thread()`, bringing some performance improvement by directly using the x4 register to refer to the variable `Thread::_thr_current`. Tested tests under hotspot/ and jdk/. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
Thanks, Xiaolin
zhengxiaolinX has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:
Implement tls function MacroAssembler::get_thread() using tls.gd
After some reconsideration, I think we can go one step further: simply keep the changes in file src/hotspot/share/utilities/globalDefinitions_gcc.hpp and remote all other changes in this PR. Then USE_LIBRARY_BASED_TLS_ONLY macro won't be defined and Thread::current() will return '_ZN6Thread12_thr_currentE' in that case. This will be the same in functionality as the newly add assembly function '_ZN10JavaThread25riscv64_get_thread_helperEv'. I have checked with Yadong, the reason for defining USE_LIBRARY_BASED_TLS_ONLY macro is to workaround some host compiler bug when accessing TLS variables in the very early days. This should not affect us now. ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
On Tue, 23 Nov 2021 07:53:46 GMT, Fei Yang <fyang@openjdk.org> wrote:
After some reconsideration, I think we can go one step further: simply keep the changes in file src/hotspot/share/utilities/globalDefinitions_gcc.hpp and remote all other changes in this PR.
Then USE_LIBRARY_BASED_TLS_ONLY macro won't be defined and Thread::current() will return '_ZN6Thread12_thr_currentE' in that case. This will be the same in functionality as the newly add assembly function '_ZN10JavaThread25riscv64_get_thread_helperEv'.
I have checked with Yadong, the reason for defining USE_LIBRARY_BASED_TLS_ONLY macro is to workaround some host compiler bug when accessing TLS variables in the very early days. This should not affect us now.
I think I get your point - I wonder if that `remote` should actually be a `revert`? If it is - that's great - I think this is a pretty and concise solution if we need to clobber all volatile registers anyway. Let me revise this patch and change the titles. But calling a `call_VM_leaf_base()` is a bit duplicated because this function would save another bunch of registers like x5 and x31 again, so I would recommend directly using a pair of `movptr_with_offset` and `jalr` - though it might be insignificant. ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
On Tue, 23 Nov 2021 08:39:28 GMT, zhengxiaolinX <duke@openjdk.java.net> wrote:
After some reconsideration, I think we can go one step further: simply keep the changes in file src/hotspot/share/utilities/globalDefinitions_gcc.hpp and remote all other changes in this PR. Then USE_LIBRARY_BASED_TLS_ONLY macro won't be defined and Thread::current() will return '_ZN6Thread12_thr_currentE' in that case. This will be the same in functionality as the newly add assembly function '_ZN10JavaThread25riscv64_get_thread_helperEv'. I have checked with Yadong, the reason for defining USE_LIBRARY_BASED_TLS_ONLY macro is to workaround some host compiler bug when accessing TLS variables in the very early days. This should not affect us now.
I think I get your point - I wonder if that `remote` should actually be a `revert`? If it is - that's great - I think this is a pretty and concise solution if we need to clobber all volatile registers anyway. Let me revise this patch and change the titles. But calling a `call_VM_leaf_base()` is a bit duplicated because this function would save another bunch of registers like x5 and x31 again, so I would recommend directly using a pair of `movptr_with_offset` and `jalr` - though it might be insignificant.
Sorry for the typo. Should be "remove". Please do. We should keep your original changes about the comment for MacroAssembler::get_thread(Register thread) at the same time :-) ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
Hi team,
This patch removes the USE_LIBRARY_BASED_TLS_ONLY macro to use tls instead of pthread_getspecific. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
Thanks, Xiaolin
zhengxiaolinX has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: Remove the USE_LIBRARY_BASED_TLS_ONLY macro to use tls instead of pthread_getspecific ------------- Changes: - all: https://git.openjdk.java.net/riscv-port/pull/7/files - new: https://git.openjdk.java.net/riscv-port/pull/7/files/d64f27ef..9fa339c5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=7&range=02 - incr: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=7&range=01-02 Stats: 44 lines in 2 files changed: 0 ins; 42 del; 2 mod Patch: https://git.openjdk.java.net/riscv-port/pull/7.diff Fetch: git fetch https://git.openjdk.java.net/riscv-port pull/7/head:pull/7 PR: https://git.openjdk.java.net/riscv-port/pull/7
On Tue, 23 Nov 2021 09:08:59 GMT, zhengxiaolinX <duke@openjdk.java.net> wrote:
Hi team,
This patch removes the USE_LIBRARY_BASED_TLS_ONLY macro to use tls instead of pthread_getspecific. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
Thanks, Xiaolin
zhengxiaolinX has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
Simply tested a `test/hotspot/jtreg/compiler` folder on qemu and no more errors found. Would you mind having another review of this patch? I feel pleased to receive any suggestion. Thanks, Xiaolin ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
On Tue, 23 Nov 2021 09:08:59 GMT, zhengxiaolinX <duke@openjdk.java.net> wrote:
Hi team,
This patch removes the USE_LIBRARY_BASED_TLS_ONLY macro to use tls instead of pthread_getspecific. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
Thanks, Xiaolin
zhengxiaolinX has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
Remove the USE_LIBRARY_BASED_TLS_ONLY macro to use tls instead of pthread_getspecific
Looks good. Thanks. ------------- Marked as reviewed by fyang (Lead). PR: https://git.openjdk.java.net/riscv-port/pull/7
On Tue, 23 Nov 2021 09:08:59 GMT, zhengxiaolinX <duke@openjdk.java.net> wrote:
Hi team,
This patch removes the USE_LIBRARY_BASED_TLS_ONLY macro to use tls instead of pthread_getspecific. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
Thanks, Xiaolin
zhengxiaolinX has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
Sorry for that typo - ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
On Thu, 18 Nov 2021 12:36:31 GMT, zhengxiaolinX <duke@openjdk.java.net> wrote:
Hi team,
This patch removes the USE_LIBRARY_BASED_TLS_ONLY macro to use tls instead of pthread_getspecific. [Original patch](https://github.com/riscv-collab/riscv-openjdk/pull/4)
Thanks, Xiaolin
This pull request has now been integrated. Changeset: 7cf6b9d2 Author: yunyao.zxl <yunyao.zxl@alibaba-inc.com> Committer: Fei Yang <fyang@openjdk.org> URL: https://git.openjdk.java.net/riscv-port/commit/7cf6b9d27a3c628771b70b9982c17... Stats: 16 lines in 2 files changed: 4 ins; 10 del; 2 mod 8277391: riscv: Remove the USE_LIBRARY_BASED_TLS_ONLY macro to use tls instead of pthread_getspecific Reviewed-by: fyang ------------- PR: https://git.openjdk.java.net/riscv-port/pull/7
participants (2)
-
Fei Yang
-
zhengxiaolinX