Change 6420645 causes SIGBUS during deoptimisation at a safepoint on 64bit-SPARC

Tue Sep 1 01:54:53 PDT 2009

Intuitively I would prefer the new implementation and would rather fix
the problem by changing the VMReg numbering in sparc.ad if this
doesn't hurt otherwise.

If the new implementation is not necessary for compressed oops, I
think it may be at least necessary if one would implement
'float_in_double' on Sparc eventually.

Could you please send me the Bug-ID of this issue once you found/created it?

Thank you and best regards,
Volker

On 9/1/09, Coleen Phillimore <Coleen.Phillimore at sun.com> wrote:
>
>  Hi,
>
>  The code that is causing this crash was changed during development of
> compressed oops and I erroneously thought it was equivalent to the original
> code, and I found it easier to understand, so I checked it in.
>  So no, there was no rationale behind this change to make compressed oops
> work. It should be backed out (and your test added to our test system).  I
> will file a new bug if there isn't one already.
>  As far as the strange encoding of floating point registers on sparc, I
> don't know the rationale behind it.  Maybe someone on the compiler mailing
> list can answer that.
>
>  Thank you for finding this.
>  Coleen
>
>
>  Volker Simonis wrote:
>
> > Hi Coleen,
> >
> > I discovered a problem during deoptimisation at a safepoint which
> > leads to a SIGBUS on 64bit-SPARC. The problem was introduced by the
> > change "6420645: Create a vm that uses compressed oops for up to 32gb
> > heapsizes" which has been submitted by you. The problem is easily
> > reproducible with the attached test program. Just run:
> >
> > java -d64 -server -showversion -Xcomp -Xbatch
> > "-XX:CompileCommand=compileonly DeoptTest
> > deopt_compiledframe_at_safepoint" -XX:+PrintCompilation
> DeoptTest
> >
> > and you will get a VM crash like:
> >
> > CompilerOracle: compileonly
> DeoptTest.deopt_compiledframe_at_safepoint
> > java version "1.6.0_14"
> > Java(TM) SE Runtime Environment (build 1.6.0_14-b08)
> > Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, compiled mode)
> >
> >  1   b   DeoptTest::deopt_compiledframe_at_safepoint (220
> bytes)
> >  1   made not entrant
> DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
> > #
> > # A fatal error has been detected by the Java Runtime Environment:
> > #
> > #  SIGBUS (0xa) at pc=0xffffffff7e37514c, pid=9314, tid=15
> > #
> > # JRE version: 6.0_14-b08
> > # Java VM: Java HotSpot(TM) 64-Bit Server VM (14.0-b16 compiled mode
> > solaris-sparc )
> > # Problematic frame:
> > # V  [libjvm.so+0x77514c]
> >
> > As noticed before, the error is a regression of change 6420645. It
> > doesn't happen with earlier versions of the HotSpot. For example 6u13
> > with HS 11 runs the test just fine:
> >
> > CompilerOracle: compileonly
> DeoptTest.deopt_compiledframe_at_safepoint
> > java version "1.6.0_13"
> > Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
> > Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, compiled mode)
> >
> >  1   b   DeoptTest::deopt_compiledframe_at_safepoint (220
> bytes)
> >  1   made not entrant
> DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
> > OK
> >
> >
> > Notice that the problem is still present in the HS head revsion. I've
> > tried with 7-ea b70:
> >
> > java version "1.7.0-ea"
> > Java(TM) SE Runtime Environment (build 1.7.0-ea-b70)
> > Java HotSpot(TM) 64-Bit Server VM (build 16.0-b07, compiled mode)
> >
> >  1   b   DeoptTest::deopt_compiledframe_at_safepoint (220
> bytes)
> >  1   made not entrant
> DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
> > #
> > # A fatal error has been detected by the Java Runtime Environment:
> > #
> > #  SIGBUS (0xa) at pc=0xffffffff7e38c7fc, pid=10714, tid=15
> > #
> > # JRE version: 7.0-b70
> > # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b07 compiled mode
> > solaris-sparc )
> > # Problematic frame:
> > # V  [libjvm.so+0x78c7fc]
> >
> >
> > The problem is caused be the following changes in frame.cpp:
> >
> >
> > --- a/src/share/vm/runtime/frame.cpp Sat Dec 01 00:00:00
> 2007 +0000
> > +++ b/src/share/vm/runtime/frame.cpp Sun Apr 13 17:43:42
> 2008 -0400
> > @@ -1153,9 +1153,8 @@ oop*
> frame::oopmapreg_to_location(VMReg
> >     // If it is passed in a register, it got spilled in the stub frame.
> >     return (oop *)reg_map->location(reg);
> >   } else {
> > -    int sp_offset_in_stack_slots = reg->reg2stack();
> > -    int sp_offset = sp_offset_in_stack_slots >> (LogBytesPerWord -
> > LogBytesPerInt);
> > -    return (oop *)&unextended_sp()[sp_offset];
> > +    int sp_offset_in_bytes = reg->reg2stack() *
> VMRegImpl::stack_slot_size;
> > +    return (oop*)(((address)unextended_sp()) +
> sp_offset_in_bytes);
> >   }
> >  }
> >
> > and the fact that reg->reg2stack() returns odd values for float
> > registers >= F32. This finally leads to a BUS error due to an
> > unaligned double read when the location of the register is accessed
> > through the reg_map during deoptimisation in
> > StackValue::create_stack_value(). In the old
> implementation, this was
> > hidden by the right shift in frame::oopmapreg_to_location() which
> > mapped F32 and F33 to the same stack offset.
> >
> > The problem can be easily solved by switching back to the old
> > implementation of frame::oopmapreg_to_location(), but I
> assume there
> > was a rational behind the change and that the new implementation is
> > probably necessary for compressed oops (at least that's what the whole
> > change was all about). So I dug a little further and found that in my
> > opinion the root cause of the whole problem is the strange numbering
> > of the 16 upper double registers in sparc.ad. They are defined as
> > follows:
> >
> > reg_def R_D32x(SOC, SOC, Op_RegD,255, F32->as_VMReg());
> > reg_def R_D32 (SOC, SOC, Op_RegD,  1, F32->as_VMReg()->next());
> > reg_def R_D34x(SOC, SOC, Op_RegD,255, F34->as_VMReg());
> > reg_def R_D34 (SOC, SOC, Op_RegD,  3, F34->as_VMReg()->next());
> > ...
> > reg_def R_D62x(SOC, SOC, Op_RegD,255, F62->as_VMReg());
> > reg_def R_D62 (SOC, SOC, Op_RegD, 31, F62->as_VMReg()->next());
> >
> > This maps the invalid half (R_D32x, R_D34x, ..) of the double
> > registers F32-F62 to even VMReg numbers (96, 98, ..) and the valid
> > part (R_D32, R_D34, ..) to odd VMReg numbers (97, 99, ..). Later on,
> > when the locals array for the safepoint is constructed in
> > Compile::FillLocArray(), the call to OptoReg::as_VMReg(regnum) for a
> > valid, even double register >= F32 (e.g. 96) returns the invalid, odd
> > part (e.g. 97). This odd VMReg number is than stored in the Location
> > part of the local and leads to the undesired behaviour in the new
> > implementation of frame::oopmapreg_to_location() as described before.
> >
> > I don't know why this strange encoding has been chosen for the 16
> > upper double registers in sparc.ad, but changing it to:
> >
> > reg_def R_D32x(SOC, SOC, Op_RegD,255, F32->as_VMReg()->next());
> > reg_def R_D32 (SOC, SOC, Op_RegD,  1, F32->as_VMReg());
> > reg_def R_D34x(SOC, SOC, Op_RegD,255, F34->as_VMReg()->next());
> > reg_def R_D34 (SOC, SOC, Op_RegD,  3, F34->as_VMReg());
> > ...
> > reg_def R_D62x(SOC, SOC, Op_RegD,255, F62->as_VMReg()->next());
> > reg_def R_D62 (SOC, SOC, Op_RegD, 31, F62->as_VMReg());
> >
> > which seems more natural to me, solved the SIGBUS issue and didn't
> > revealed any other problems in the tests which I run so far.
> >
> > Could you please comment on the proposed solution of changing the
> > VMReg numbering of F32-F62 or advice a better solution if you think
> > that the proposed one will not work in the general case?
> >
> > Thank you and best regards,
> > Volker
> >
> > PS: while I was hunting the error, I also stumbled across the
> > following code in RegisterSaver::save_live_registers():
> >
> >  // Save all the FP registers
> >  int offset = d00_offset;
> >  for( int i=0; i<64; i+=2 ) {
> >    FloatRegister f = as_FloatRegister(i);
> >    __ stf(FloatRegisterImpl::D,  f, SP, offset+STACK_BIAS);
> >
> map->set_callee_saved(VMRegImpl::stack2reg(offset>>2),
> f->as_VMReg());
> >    if (true) {
> >      map->set_callee_saved(VMRegImpl::stack2reg((offset +
> > sizeof(float))>>2), f->as_VMReg()->next());
> >    }
> >    offset += sizeof(double);
> >  }
> >
> > In my opinion, this could be changed to:
> >
> >  // Save all the FP registers
> >  int offset = d00_offset;
> >  for( int i=0; i<64; i+=2 ) {
> >    FloatRegister f = as_FloatRegister(i);
> >    __ stf(FloatRegisterImpl::D,  f, SP, offset+STACK_BIAS);
> >
> map->set_callee_saved(VMRegImpl::stack2reg(offset>>2),
> f->as_VMReg());
> >    if (i < 32) { // VS 2009-08-31: the 16 upper double registers
> > can't be used as floats anyway
> >      map->set_callee_saved(VMRegImpl::stack2reg((offset +
> > sizeof(float))>>2), f->as_VMReg()->next());
> >    }
> >    offset += sizeof(double);
> >  }
> >
> > because the 16 upper double registers can't be used as floats anyway.
> > Again, I didn't found any regression in my few tests. What do you
> > think?
> >
> >
>
>