[10] RFR(L): 8185979: PPC64: Implement SHA2 intrinsic

Gustavo Romero gromero at linux.vnet.ibm.com
Thu Sep 7 00:28:34 UTC 2017


Hi Martin,

On 01-09-2017 12:39, Doerr, Martin wrote:
> It'd also be good to know, if relying on vrsave=-1 is safe.

VRSAVE is set to -1 in kernelspace on a VEC or a VSX unavailable exception, in
load_up_altivec(), arch/powerpc/kernel/vector.S [1]:

 51         /*
 52          * While userspace in general ignores VRSAVE, glibc uses it as a boolean
 53          * to optimise userspace context save/restore. Whenever we take an
 54          * altivec unavailable exception we must set VRSAVE to something non
 55          * zero. Set it to all 1s. See also the programming note in the ISA.
 56          */
 57         mfspr   r4,SPRN_VRSAVE
 58         cmpwi   0,r4,0
 59         bne+    1f
 60         li      r4,-1
 61         mtspr   SPRN_VRSAVE,r4

All program images are created with MSR_VEC and MSR_VSX disabled (set to zero)
and VRSAVE set to zero as well. However, the first execution of a vector
(VMX/Altivec) or a VSX (Vector-Scalar) instruction raises an exception, and
the exception code path calls load_up_altivec(), which sets VRSAVE=-1 if it is
equal to zero (load_up_vsx() calls load_up_altivec()).
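
One way to observe this from userspace would be to read VRSAVE directly. The
following is only a minimal sketch: read_vrsave() is a hypothetical helper (it
is not part of the JVM or of glibc), it assumes GCC-style inline assembly, and
it relies on VRSAVE being SPR 256, readable with 'mfspr' in problem state. On
non-PowerPC targets it just reports failure.

```c
#include <stdint.h>

/*
 * Hypothetical helper: stores the current VRSAVE value in *out and
 * returns 0 on PowerPC; returns -1 on any other architecture.
 */
static int read_vrsave(uint32_t *out)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    unsigned long v;
    /* SPR 256 is VRSAVE; it can be read from userspace. */
    __asm__ volatile("mfspr %0, 256" : "=r"(v));
    *out = (uint32_t)v;
    return 0;
#else
    (void)out;  /* nothing to read on this architecture */
    return -1;
#endif
}
```

If the behaviour described above holds, a process that has already executed
any vector or VSX instruction should read back 0xffffffff, unless it has set
VRSAVE to some other non-zero value itself.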

The check on lines 58 and 59 guarantees that a userspace program can freely set
VRSAVE to any non-zero value: on a later VEC/VSX unavailable exception that
value won't be clobbered (set to -1 again) and will stay as the user set it. A
new exception can occur after a sufficient number of context switches, when the
MSR_VEC and MSR_VSX bits get disabled as part of the kernel's mechanism to avoid
the burden of saving/restoring the vec, fp, and vsx registers on context
switches if they are not in use by the program.
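
To make the effect of lines 57-61 concrete, here is a small userspace model of
that logic in C. It is only an illustration under my own naming
(vrsave_after_unavailable() is made up), not kernel code:

```c
#include <stdint.h>

/*
 * Hypothetical model of lines 57-61 of load_up_altivec(): given the
 * VRSAVE value found on entry, return the value left in VRSAVE.
 */
static uint32_t vrsave_after_unavailable(uint32_t vrsave)
{
    if (vrsave != 0)       /* cmpwi 0,r4,0 ; bne+ 1f */
        return vrsave;     /* a non-zero (user-set) value is kept */
    return UINT32_MAX;     /* li r4,-1 ; mtspr SPRN_VRSAVE,r4 */
}
```

So a zero VRSAVE becomes 0xffffffff, while a value such as 0x80000000 passes
through unchanged.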

For instance, a simple program like the one below, compiled with
'gcc -static -O0 vrsave_.c -o vrsave_':

vrsave_.c:

int main() {}

will trigger VRSAVE=-1 in kernelspace once it executes a 'stvx' VMX instruction
in __vmx__sigsetjmp():

$ gdb -q ./vrsave_
(gdb) x/i 0x1000db20
   0x1000db20 <__vmx__sigsetjmp+424>:	stvx    v20,0,r5

The address 0x1000db20 can be determined by a crafted systemtap probe [2], for
instance:

start_thread.return   : vrsave=0x0 start=0x10000840
load_up_altivec.call  :
trap=0xf21 nip=0x1000db20 vrsave=0x0
 0xc00000000000c7c0 : load_up_altivec+0x0/0x164 [kernel]
 0x0 (inexact)
load_up_altivec.return:
trap=0xf21 nip=0x1000db20 vrsave=0xffffffff
Returning from:  0xc00000000000c7c0 : load_up_altivec+0x0/0x164 [kernel]
Returning to  :  0xc000000000009c78 : altivec_unavailable_common+0xf8/0x150 [kernel]
 0x0 (inexact)
__switch_to().call      : vrsave=0xffffffff, nip=0x1002c4bc

trap=0xf21 means a "Vector Unavailable" [3] exception was taken.

-- 

With dynamic linkage we get almost the same result, but with a VSX unavailable
exception (trap=0xf41), caused by an 'xxspltd' in memset() (0x7fff8745d428):

start_thread.return   : vrsave=0x0 start=0x7fff874312e0
load_up_altivec.call  :
trap=0xf41 nip=0x7fff8745d428 vrsave=0x0
 0xc00000000000c7c0 : load_up_altivec+0x0/0x164 [kernel]
 0x0 (inexact)
load_up_altivec.return:
trap=0xf41 nip=0x7fff8745d428 vrsave=0xffffffff
Returning from:  0xc00000000000c7c0 : load_up_altivec+0x0/0x164 [kernel]
Returning to  :  0xc00000000000ca5c : load_up_vsx+0x10/0x2c [kernel]
 0x0 (inexact)
__switch_to.call      : vrsave=0xffffffff, nip=0x100005ac

(gdb) x/i 0x7fff8745d428
   0x7fff8745d428 <memset+552>:	xxspltd vs0,vs12,0
(gdb) bt
#0  memset (dstpp=0x7fffffffeec0, c=0, len=640) at ../string/memset.c:29
#1  0x00007ffff7fa24d4 in _dl_start (arg=0x7ffffffff410) at rtld.c:373
#2  0x00007ffff7fa12f8 in _start () from /lib64/ld64.so.2
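
The trap values in both traces (0xf21, 0xf41) are one higher than the
architected vector offsets (0xf20, 0xf40). The sketch below shows how they can
be mapped back, assuming the low bits of the saved trap value are flag bits
that the kernel's TRAP() macro masks off; that assumption and the helper names
are mine and are not taken from the systemtap script:

```c
/* Hypothetical helper: strip the low flag bits from a saved trap value. */
static unsigned long trap_vector(unsigned long trap)
{
    return trap & ~0xFUL;
}

/* Hypothetical helper: name the two interrupt vectors seen in the traces. */
static const char *trap_name(unsigned long trap)
{
    switch (trap_vector(trap)) {
    case 0xf20: return "Vector Unavailable";
    case 0xf40: return "VSX Unavailable";
    default:    return "other";
    }
}
```

With this, 0xf21 maps to 0xf20 (Vector Unavailable) and 0xf41 maps to 0xf40
(VSX Unavailable).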

--

Setting VRSAVE=-1 in load_up_altivec() dates back at least to kernel
v2.6.31-rc1, according to 'git tag --contains e821ea70f', so it's pretty old.

Martin, please let me know if you need any additional information to check
whether we can rely on vrsave=-1 and so stop taking care of vrsave in the JVM
for Linux BE as well.


Best regards,
Gustavo

[1] https://github.com/torvalds/linux/blob/master/arch/powerpc/kernel/vector.S#L52
[2] http://cr.openjdk.java.net/~gromero/script.d
[3] ISA 3.0. Figure 68, "Effective address of interrupt vector by interrupt type"


