[9] RFR(S): 8172731: runtime/Thread/TooSmallStackSize.java fails on solaris-x64 with product build

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Jan 17 14:29:51 UTC 2017


On 1/17/17 12:49 AM, Tobias Hartmann wrote:
> Hi,
>
> On 17.01.2017 04:51, Daniel D. Daugherty wrote:
>> On 1/16/17 5:53 PM, David Holmes wrote:
>>> On 17/01/2017 12:26 AM, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> Goetz suggested (off-thread) to add a comment to the patch stating the reason for the large stack size:
>>>> http://cr.openjdk.java.net/~thartmann/8172731/webrev.01/
>>>>
>>>> Are you fine with that?
>>> Probably should say "Solaris Studio" rather than "Sun"
> Okay, I just added the output of "cc -V". Change it to "Solaris Studio".
>
>> It would also be good to document that the fastdebug version
>> requires less stack space than the release version. That bit
>> of strangeness should be called out clearly...
> Right, I added this to the comment.
>
>> I'm good with pushing this fix.
> For the record, I'm pushing this:
> http://cr.openjdk.java.net/~thartmann/8172731/webrev.02/

Looks good to me.

Again, thanks for fixing this bug.

Dan


>
> Thanks,
> Tobias
>
>> Dan
>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> On 16.01.2017 13:38, Tobias Hartmann wrote:
>>>>> Hi David,
>>>>>
>>>>> On 16.01.2017 13:33, David Holmes wrote:
>>>>>> Sorry I missed answering the "should we file a studio compiler bug" question. Don't we generate the MachNodeGenerator code using adlc? What does the thing we generate actually look like?
>>>>> Yes, the C(++) code is generated using adlc. It looks like this:
>>>>>
>>>>> //------------------------- MachNode Generator ---------------
>>>>> // A switch statement on the dense-packed user-defined type system
>>>>> // that invokes 'new' on the corresponding class constructor.
>>>>>
>>>>> MachNode *State::MachNodeGenerator(int opcode){
>>>>>    switch(opcode) {
>>>>>    case loadB_rule: {
>>>>>        loadBNode *node = new loadBNode();
>>>>>        return node;
>>>>>      }
>>>>>    case loadB2L_rule: {
>>>>>        loadB2LNode *node = new loadB2LNode();
>>>>>        return node;
>>>>>      }
>>>>>    case loadUB_rule: {
>>>>>        loadUBNode *node = new loadUBNode();
>>>>>        return node;
>>>>>      }
>>>>>    case loadUB2L_rule: {
>>>>>
>>>>> [...]
>>>>>
>>>>>> David
>>>>>>
>>>>>> On 16/01/2017 10:21 PM, David Holmes wrote:
>>>>>>> On 16/01/2017 10:15 PM, Tobias Hartmann wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> thanks to everyone for looking at this.
>>>>>>>>
>>>>>>>>> My concern with this series of changes is that we're claiming to be
>>>>>>>>> setting shared posix values when it seems in fact the requirements
>>>>>>>>> are different across "posix" platforms. We now penalize all posix
>>>>>>>>> platforms with a larger minimum stack due to this Solaris x64 issue.
>>>>>>>> Why do we penalize all platforms with this fix? The changes are to
>>>>>>>> 'os_solaris_x86.cpp' and only affect a Solaris x86 build.
>>>>>>> Sorry I was mis-remembering that despite being called
>>>>>>> os::Posix::_compiler_thread_min_stack_allowed it is actually a platform
>>>>>>> specific value.
>>>>> Okay, no problem.
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>>> I'm also very curious as to why Solaris x64 needs to be different?
>>>>>>>> [see below]
>>>>>>>>
>>>>>>>>>> Anyways, this size looks strange. Other platforms get along with 32K
>>>>>>>>>> and 48K
>>>>>>>>>> compiler thread size!  Why does this platform need ten times more
>>>>>>>>>> than others?
>>>>>>>>>> Is it because C2 optimizations are configured more aggressive,
>>>>>>>>>> leading to bigger
>>>>>>>>>> compile tasks during startup?
>>>>>>>>>> But solaris_sparc also needs a lot (102K) as well as aix (192K). At
>>>>>>>>>> some point
>>>>>>>>>> I'll have a look at why aix needs that much.
>>>>>>>> I don't think it's due to C2 optimizations. They shouldn't differ much
>>>>>>>> on Solaris x86 compared to Linux x86.
>>>>>>>>
>>>>>>>> I had a closer look at the failures and we always fail in the C2
>>>>>>>> matcher in the generated method State::MachNodeGenerator(int) which
>>>>>>>> basically consists of a *huge* switch statement matching opcodes.
>>>>>>>> Turns out that with a product build we reserve lots of stack space on
>>>>>>>> method entry:
>>>>>>>>
>>>>>>>> 0xffff80ff92537cf0<MachNode*State::MachNodeGenerator(int)>: push
>>>>>>>> %rbp
>>>>>>>> 0xffff80ff92537cf1<MachNode*State::MachNodeGenerator(int)+1>: mov
>>>>>>>> %rsp,%rbp
>>>>>>>> 0xffff80ff92537cf4<MachNode*State::MachNodeGenerator(int)+4>: push
>>>>>>>> %r15
>>>>>>>> 0xffff80ff92537cf6<MachNode*State::MachNodeGenerator(int)+6>: sub
>>>>>>>> $0x53ba8,%rsp
>>>>>>>>
>>>>>>>> This requires ~335k of stack space whereas a fastdebug build only
>>>>>>>> requires ~15k for the same method:
>>>>>>>>
>>>>>>>> 0xffff80ff7c8a0460<MachNode*State::MachNodeGenerator(int)>: push
>>>>>>>> %rbp
>>>>>>>> 0xffff80ff7c8a0461<MachNode*State::MachNodeGenerator(int)+1>: mov
>>>>>>>> %rsp,%rbp
>>>>>>>> 0xffff80ff7c8a0464<MachNode*State::MachNodeGenerator(int)+4>: sub
>>>>>>>> $0x9430,%rsp
>>>>>>>>
>>>>>>>> This is consistent with my findings that a fastdebug VM requires a
>>>>>>>> CompilerThreadStackSize of 155k whereas a product build requires 384k.
>>>>>>>>
>>>>>>>> I guess that this difference is due to more aggressive optimizations
>>>>>>>> done by the Sun Studio Compiler for product builds but from looking at
>>>>>>>> the assembly code, it's not obvious what all that space is actually
>>>>>>>> used for.
>>>>>>>>
>>>>>>>> Do you think we should file a Sun Studio Compiler bug for this?
>>>>>>>>
>>>>>>>>>> Basically this is the right fix for the immediate problem, though.
>>>>>>>>>> Looks good.
>>>>>>>> Thanks, if there are no objections I would like to push this fix.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Tobias
>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>    Goetz
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>>>>>>> Sent: Samstag, 14. Januar 2017 01:04
>>>>>>>>>>> To: Tobias Hartmann <tobias.hartmann at oracle.com>; hotspot-
>>>>>>>>>>> dev at openjdk.java.net Developers <hotspot-dev at openjdk.java.net>;
>>>>>>>>>>> Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
>>>>>>>>>>> Subject: Re: [9] RFR(S): 8172731:
>>>>>>>>>>> runtime/Thread/TooSmallStackSize.java fails
>>>>>>>>>>> on solaris-x64 with product build
>>>>>>>>>>>
>>>>>>>>>>> Adding Goetz to this email thread directly.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 1/13/17 10:37 AM, Tobias Hartmann wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> please review the following patch:
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8172731
>>>>>>>>>>>> http://cr.openjdk.java.net/~thartmann/8172731/webrev.00/
>>>>>>>>>>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp
>>>>>>>>>>>       No comments.
>>>>>>>>>>>
>>>>>>>>>>> Thumbs up! Thanks for fixing this issue.
>>>>>>>>>>>
>>>>>>>>>>> Goetz, can you please chime in on this thread since you were the
>>>>>>>>>>> author for:
>>>>>>>>>>>
>>>>>>>>>>>       JDK-8170655: [posix] Fix minimum stack size computations
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8170655
>>>>>>>>>>>
>>>>>>>>>>> My in-house testing as sponsor didn't catch this issue and now
>>>>>>>>>>> I look more closely at your testing comment in the bug report,
>>>>>>>>>>> I don't see Solaris X64 listed.
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> The test fails because the VM crashes with a
>>>>>>>>>>>> CompilerThreadStackSize of
>>>>>>>>>>> 300k. The problem is caused by the fix for JDK-8170655 [1] which
>>>>>>>>>>> now allows
>>>>>>>>>>> a minimum CompilerThreadStackSize of 300k, before it was 396k. 300k
>>>>>>>>>>> is not
>>>>>>>>>>> enough to start the VM on Solaris x86. This didn't show up when the
>>>>>>>>>>> fix was
>>>>>>>>>>> integrated because it was only tested with debug builds (with a
>>>>>>>>>>> debug build
>>>>>>>>>>> the number of StackShadowPages is increased by 2 and therefore the
>>>>>>>>>>> minimal CompilerThreadStackSize is larger).
>>>>>>>>>>>> Setting the guard pages to the minimum size via "-
>>>>>>>>>>> XX:StackShadowPages=10 -XX:StackRedPages=1 -XX:StackYellowPages=2"
>>>>>>>>>>> requires a CompilerThreadStackSize of at least 381k to not crash at
>>>>>>>>>>> startup
>>>>>>>>>>> with a product build but the VM only checks for >= 260k. I
>>>>>>>>>>> increased the
>>>>>>>>>>> value of os::Posix::_compiler_thread_min_stack_allowed by 123 to
>>>>>>>>>>> 325 such
>>>>>>>>>>> that the minimum allowed CompilerThreadStackSize is 384k (due to the
>>>>>>>>>>> additional space allocated for guard pages and rounding to the page
>>>>>>>>>>> size).
>>>>>>>>>>>> Tested with failing test, manually and with RBT (running).
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Tobias
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8170655



More information about the hotspot-dev mailing list