Change 6420645 causes SIGBUS during deoptimisation at a safepoint on 64bit-SPARC
Coleen Phillimore
Coleen.Phillimore at Sun.COM
Mon Aug 31 15:35:26 PDT 2009
Hi,
The code that is causing this crash was changed during development of
compressed oops and I erroneously thought it was equivalent to the
original code, and I found it easier to understand, so I checked it in.
So no, there was no rationale behind this change to make compressed oops
work. It should be backed out (and your test added to our test system).
I will file a new bug if there isn't one already.
As far as the strange encoding of floating point registers on sparc, I
don't know the rationale behind it. Maybe someone on the compiler
mailing list can answer that.
Thank you for finding this.
Coleen
Volker Simonis wrote:
> Hi Coleen,
>
> I discovered a problem during deoptimisation at a safepoint which
> leads to a SIGBUS on 64bit-SPARC. The problem was introduced by the
> change "6420645: Create a vm that uses compressed oops for up to 32gb
> heapsizes" which has been submitted by you. The problem is easily
> reproducible with the attached test program. Just run:
>
> java -d64 -server -showversion -Xcomp -Xbatch
> "-XX:CompileCommand=compileonly DeoptTest
> deopt_compiledframe_at_safepoint" -XX:+PrintCompilation DeoptTest
>
> and you will get a VM crash like:
>
> CompilerOracle: compileonly DeoptTest.deopt_compiledframe_at_safepoint
> java version "1.6.0_14"
> Java(TM) SE Runtime Environment (build 1.6.0_14-b08)
> Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, compiled mode)
>
> 1 b DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
> 1 made not entrant DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGBUS (0xa) at pc=0xffffffff7e37514c, pid=9314, tid=15
> #
> # JRE version: 6.0_14-b08
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (14.0-b16 compiled mode
> solaris-sparc )
> # Problematic frame:
> # V [libjvm.so+0x77514c]
>
> As noticed before, the error is a regression of change 6420645. It
> doesn't happen with earlier versions of the HotSpot. For example 6u13
> with HS 11 runs the test just fine:
>
> CompilerOracle: compileonly DeoptTest.deopt_compiledframe_at_safepoint
> java version "1.6.0_13"
> Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, compiled mode)
>
> 1 b DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
> 1 made not entrant DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
> OK
>
>
> Notice that the problem is still present in the HS head revsion. I've
> tried with 7-ea b70:
>
> java version "1.7.0-ea"
> Java(TM) SE Runtime Environment (build 1.7.0-ea-b70)
> Java HotSpot(TM) 64-Bit Server VM (build 16.0-b07, compiled mode)
>
> 1 b DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
> 1 made not entrant DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGBUS (0xa) at pc=0xffffffff7e38c7fc, pid=10714, tid=15
> #
> # JRE version: 7.0-b70
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b07 compiled mode
> solaris-sparc )
> # Problematic frame:
> # V [libjvm.so+0x78c7fc]
>
>
> The problem is caused be the following changes in frame.cpp:
>
>
> --- a/src/share/vm/runtime/frame.cpp Sat Dec 01 00:00:00 2007 +0000
> +++ b/src/share/vm/runtime/frame.cpp Sun Apr 13 17:43:42 2008 -0400
> @@ -1153,9 +1153,8 @@ oop* frame::oopmapreg_to_location(VMReg
> // If it is passed in a register, it got spilled in the stub frame.
> return (oop *)reg_map->location(reg);
> } else {
> - int sp_offset_in_stack_slots = reg->reg2stack();
> - int sp_offset = sp_offset_in_stack_slots >> (LogBytesPerWord -
> LogBytesPerInt);
> - return (oop *)&unextended_sp()[sp_offset];
> + int sp_offset_in_bytes = reg->reg2stack() * VMRegImpl::stack_slot_size;
> + return (oop*)(((address)unextended_sp()) + sp_offset_in_bytes);
> }
> }
>
> and the fact that reg->reg2stack() returns odd values for float
> registers >= F32. This finally leads to a BUS error due to an
> unaligned double read when the location of the register is accessed
> through the reg_map during deoptimisation in
> StackValue::create_stack_value(). In the old implementation, this was
> hidden by the right shift in frame::oopmapreg_to_location() which
> mapped F32 and F33 to the same stack offset.
>
> The problem can be easily solved by switching back to the old
> implementation of frame::oopmapreg_to_location(), but I assume there
> was a rational behind the change and that the new implementation is
> probably necessary for compressed oops (at least that's what the whole
> change was all about). So I dug a little further and found that in my
> opinion the root cause of the whole problem is the strange numbering
> of the 16 upper double registers in sparc.ad. They are defined as
> follows:
>
> reg_def R_D32x(SOC, SOC, Op_RegD,255, F32->as_VMReg());
> reg_def R_D32 (SOC, SOC, Op_RegD, 1, F32->as_VMReg()->next());
> reg_def R_D34x(SOC, SOC, Op_RegD,255, F34->as_VMReg());
> reg_def R_D34 (SOC, SOC, Op_RegD, 3, F34->as_VMReg()->next());
> ...
> reg_def R_D62x(SOC, SOC, Op_RegD,255, F62->as_VMReg());
> reg_def R_D62 (SOC, SOC, Op_RegD, 31, F62->as_VMReg()->next());
>
> This maps the invalid half (R_D32x, R_D34x, ..) of the double
> registers F32-F62 to even VMReg numbers (96, 98, ..) and the valid
> part (R_D32, R_D34, ..) to odd VMReg numbers (97, 99, ..). Later on,
> when the locals array for the safepoint is constructed in
> Compile::FillLocArray(), the call to OptoReg::as_VMReg(regnum) for a
> valid, even double register >= F32 (e.g. 96) returns the invalid, odd
> part (e.g. 97). This odd VMReg number is than stored in the Location
> part of the local and leads to the undesired behaviour in the new
> implementation of frame::oopmapreg_to_location() as described before.
>
> I don't know why this strange encoding has been chosen for the 16
> upper double registers in sparc.ad, but changing it to:
>
> reg_def R_D32x(SOC, SOC, Op_RegD,255, F32->as_VMReg()->next());
> reg_def R_D32 (SOC, SOC, Op_RegD, 1, F32->as_VMReg());
> reg_def R_D34x(SOC, SOC, Op_RegD,255, F34->as_VMReg()->next());
> reg_def R_D34 (SOC, SOC, Op_RegD, 3, F34->as_VMReg());
> ...
> reg_def R_D62x(SOC, SOC, Op_RegD,255, F62->as_VMReg()->next());
> reg_def R_D62 (SOC, SOC, Op_RegD, 31, F62->as_VMReg());
>
> which seems more natural to me, solved the SIGBUS issue and didn't
> revealed any other problems in the tests which I run so far.
>
> Could you please comment on the proposed solution of changing the
> VMReg numbering of F32-F62 or advice a better solution if you think
> that the proposed one will not work in the general case?
>
> Thank you and best regards,
> Volker
>
> PS: while I was hunting the error, I also stumbled across the
> following code in RegisterSaver::save_live_registers():
>
> // Save all the FP registers
> int offset = d00_offset;
> for( int i=0; i<64; i+=2 ) {
> FloatRegister f = as_FloatRegister(i);
> __ stf(FloatRegisterImpl::D, f, SP, offset+STACK_BIAS);
> map->set_callee_saved(VMRegImpl::stack2reg(offset>>2), f->as_VMReg());
> if (true) {
> map->set_callee_saved(VMRegImpl::stack2reg((offset +
> sizeof(float))>>2), f->as_VMReg()->next());
> }
> offset += sizeof(double);
> }
>
> In my opinion, this could be changed to:
>
> // Save all the FP registers
> int offset = d00_offset;
> for( int i=0; i<64; i+=2 ) {
> FloatRegister f = as_FloatRegister(i);
> __ stf(FloatRegisterImpl::D, f, SP, offset+STACK_BIAS);
> map->set_callee_saved(VMRegImpl::stack2reg(offset>>2), f->as_VMReg());
> if (i < 32) { // VS 2009-08-31: the 16 upper double registers
> can't be used as floats anyway
> map->set_callee_saved(VMRegImpl::stack2reg((offset +
> sizeof(float))>>2), f->as_VMReg()->next());
> }
> offset += sizeof(double);
> }
>
> because the 16 upper double registers can't be used as floats anyway.
> Again, I didn't found any regression in my few tests. What do you
> think?
>
More information about the hotspot-compiler-dev
mailing list