Request for Review(S): JVM crashes when throwing StackOverflow exception from interpreter
Bertrand Delsart
bertrand.delsart at oracle.com
Thu Dec 15 05:56:24 PST 2011
Hi Yang,
I'm currently testing a modified version of the fix and I will go
through the shared runtime stubs anyway, if not exactly for the reasons
that were stated in the comments :-)
Hence, these following answers are just for information, to try to
clarify any remaining confusion.
FYI, your fix was correct from a coding point of view. The only
difference in what I'm currently testing is that I'm avoiding the
delayed loading by moving the generation of the stub to
StubGenerator::generate_initial. I'll generate the updated webrev once
my testing is complete.
As for the comments, they now seem correct in a C2 context... but got us
confused because the problem is more general. It was failing on a
platform which did not modify SP and which did not use callee-saved
register :-) The problem is indeed that we had not considered that the
caller might not be an interpreted frame. The issues with calling the
interpreter version of the throw instead of the shared runtime one are
more than just a problem of missing stub and callee-saved registers.
Note that we were still discussing it internally when we were pinged by
Volker and I opened up the discussion. I am indeed more familiar with C1
and I was still trying to get some confirmation for C2 (this is the only
only part of the JVM I've not yet been seriously involved with). The
status of ebp/rbp was not yet clear in the answers I got. I knew it was
saved/restored but, from the answers I got and the fact that it is used
in a special way for JSR292, I thought it was no longer really 'trusted'
as an allocable callee-saved registers. I was going to double-check the
AD file but I'll trust you if you saw EBP in a call site oopmap. Thanks
for double-checking. Seems that C1 is not smart enough to trust EBP's
restored value but that C2 is much smarter :-)
I'll let C2 experts chime in if they think this needs further clarification.
Thanks again for spotting the bug, providing a test case and suggesting
a solution. In fact, one of our platforms was doing the right thing
(including generating the stub in generate_initial) but we had not
noticed the discrepancy :-(
Bertrand.
On 12/15/11 11:46 AM, Wang, Yang wrote:
> Hi Bertrand,
> I don't want to insert comments on the long text, so here are some points I would like to clarify:
> 1. I understand, hotspot does not use callee-saved register in compiled java code for performance reasons. However, ebp is an exception. It is pushed in the beginning of a function(Prolog), and poped in the end(Epilog). So I don't see any difference to the callee-save concept. If there's any difference(maybe from register allocator's perspective?), could you help clarify?
>
> 2. You mentioned "the GC should not have to update any OOP in callee-saved registers except arguments". However, the location of ebp is always updated explicitly in frame traversal for the sake of GC(see function update_map_with_saved_link).
>
> 3. As SAP JVM is a server VM, all I mentioned is only about the server edition. I noticed there's difference between C1 and C2 in the context(see function sender_for_interpreter_frame). It may add extra difficulty to your review :(
>
> Thanks,
> Yang
>
> -----Original Message-----
> From: yang02.wang at sap.com
> Sent: Mittwoch, 14. Dezember 2011 18:55
> To: 'Bertrand Delsart'
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: RE: Request for Review(S): JVM crashes when throwing StackOverflow exception from interpreter
>
> Hello again,
> Well, I think I am still not clear enough about the term "callee-save". From Java ABI's perspective, there is no callee-saved registers; but from C ABI's perspective, ebp is callee-saved, and it is causing problem : it saves and restores it without interacting with GC.
>
> And I don't understand why you say " For instance, on x86, "ebp" cannot be an "oop that expects to be GCed" when at a method call site". Actually I saw "ebp" in OopMap in a call site with "PrintAssembly".
>
> Thanks,
> Yang
> -----Original Message-----
> From: Bertrand Delsart [mailto:bertrand.delsart at oracle.com]
> Sent: Mittwoch, 14. Dezember 2011 17:50
> To: Wang, Yang
> Cc: Volker Simonis; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: Request for Review(S): JVM crashes when throwing StackOverflow exception from interpreter
>
> Sorry,
>
> I just realized that x86 was not using the same stub.
>
> I got caught up in the confusion created by the comments and did not
> notice the difference between the call to
> Interpreter::throw_StackOverflowError_entry() and the call to
> SharedRuntime::throw_StackOverflowError_entry()
>
> The real reason for the failure seems to be that x86 was calling
> something that assumes the active frame is interpreted (see
> generate_StackOverflowError_handler). If this is called with a compiled
> frame context (after a c2i), this could indeed lead to issues that I
> need to investigate further.
>
> I know understand better the reasons for your fix and will see whether
> this is indeed calling the right stub and whether I could generate it
> earlier to avoid delaying the load.
>
> Thanks,
>
> Bertrand.
>
> On 12/14/11 05:14 PM, Bertrand Delsart wrote:
>> Hi Yang,
>>
>> Answers inlined.
>>
>> On 12/14/11 03:26 PM, Wang, Yang wrote:
>>> 1. For the stub concern, it is true that there is an existing stub,
>>> but it is only invoked via implicit exception handler(see function
>>> continuation_for_implicit_exception). For explicit stack overflow
>>> check in "interpreter", the stub is in hotspot not used previously. So
>>> the fix is to borrow it. Since the stub is generated after the
>>> code-generation for interpreter, the entry address was load indirectly.
>>
>> The stub was used on x86 in the call you modified :-) In addition, some
>> stubs are generated before the interpreter. That's why I asked whether
>> you had really noticed an issue (e.g. whether the code generated a jump
>> to a NULL address on x86).
>>
>> Anyway, I'll double check when the stub is generated and, if necessary,
>> see whether we can initialize it earlier instead of deferring the load.
>>
>>> 2. Going through the stub is like a guard for the VM call, as shown in
>>> the picture in webrev: stub frame on top of C frame. When GC fails to
>>> handle the OOP in C frame, the stub frame remedies it by containing a
>>> copy of the OOP, which is to be handled by GC.
>>
>> Understood. The problem is not really about whether we need a stub (I'm
>> still reviewing that for sparc). The problem is about the reasons
>> stated. I'm trying to remove the comments that might be confusing.
>>
>> For instance, on x86, "ebp" cannot be an "oop that expects to be GCed"
>> when at a method call site (elsewhere, it can indeed be an oop that
>> needs to be updated). See point 3. From a jited code point of view,
>> there is no callee-saved registers in our compiled Java ABI for x86.
>>
>> I'm currently reviewing the Sparc part because it is indeed a special
>> case... and the one that was unfortunately not using the stub :-(
>>
>>> 3. For the concern of callee-saved register, I guess you mean Sparc
>>> only. You are right that the IN and Local registers are not called
>>> "callee-saved" technically(but "never-saved"). However, they are also
>>> saved by callee in callee's frame in certain cases(like register
>>> window overflow). I hope my understanding here is correct.
>>
>> No. That was not sparc only. In fact Sparc might be the exception.
>>
>> If you look for instance at the definition of is_caller_save_register()
>> in c1_FrameMap_x86.hpp, you'll see that HotSpot now assumes that all
>> registers can be scratched by called Java methods. We measured that
>> performance were better when not using callee saved registers (for
>> compiled Java code).
>>
>> Sparc is a special case because the IN and Local registers are
>> automatically protected. This is the only platform which provides
>> callee-saved registers "for free" when building frames (thanks to the
>> register windows). Hence, I thought that our JIT compilers were no
>> longer using callee-saved registers on any of the supported platforms
>> but Sparc might indeed be the exception.
>>
>>>
>>> Thanks and welcome more inputs!
>>> Yang
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Bertrand Delsart [mailto:bertrand.delsart at oracle.com]
>>> Sent: Mittwoch, 14. Dezember 2011 13:36
>>> To: Wang, Yang
>>> Cc: Volker Simonis; hotspot-compiler-dev at openjdk.java.net; Coleen
>>> Phillimore
>>> Subject: Re: Request for Review(S): JVM crashes when throwing
>>> StackOverflow exception from interpreter
>>>
>>> Sorry,
>>>
>>> I was notified of the CR and got involved in the bug because we were
>>> wondering whether the bug was impacting our embedded ports. I sent my
>>> own review in an internal alias but should have let you know. FYI, we
>>> were still discussing it because I raised some issues about the proposed
>>> fix.
>>>
>>> First, thanks for spotting the bug! We do agree with the first problem.
>>> We should indeed ensure that SP is correct.
>>>
>>> My concerns were about the second part.
>>>
>>> First, the change is doing more that just going through the stub. It
>>> seems that your are trying to solve a third problem. x86 was already
>>> using the throw_StackOverflowError_entry stub. What your proposed change
>>> seems to imply is that you encountered a case where the stub had not yet
>>> been generated and hence you had to delay the loading of its address. Is
>>> that correct ?
>>>
>>> In addition, the explanations about why we have to go through that stub
>>> did not fit my knowledge of HotSpot and I was trying to clarify that
>>> internally (thanks Tom for your confirmations).
>>>
>>> With our Java ABI, none of the registers allocated by the compiler are
>>> callee-saved. Hence, at call sites, the GC should not have to update any
>>> OOP in callee-saved registers (but might have to update arguments). If
>>> fixing SP is not sufficient to solve your test case on SPARC then the
>>> problem is not exactly what you describe.
>>>
>>> Now, that does not mean the fix would not be correct :-) Going through
>>> that stub might be safer anyway, particularly since we are doing it on
>>> x86. My issues are about the comments themselves (and the third problem
>>> mentioned above).
>>>
>>> Thanks,
>>>
>>> Bertrand.
>>>
>>> On 12/13/11 09:19 PM, Christian Thalinger wrote:
>>>>
>>>> On Dec 7, 2011, at 10:45 AM, Volker Simonis wrote:
>>>>
>>>>> There's now a bug report for this issue:
>>>>>
>>>>> 7116216: StackOverflow GC crash
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7116216
>>>>>
>>>>> so could somebody please have a look at the fix proposed by Yang?
>>>>
>>>> Sorry, I will have a look tomorrow.
>>>>
>>>> -- Chris
>>>>
>>>>>
>>>>> Regards,
>>>>> Volker
>>>>>
>>>>> On Wed, Nov 16, 2011 at 6:04 PM, Wang, Yang<yang02.wang at sap.com> wrote:
>>>>>> Hi folks,
>>>>>>
>>>>>> We found two problems during throwing an StackOverflow exception from
>>>>>> interpreter.
>>>>>>
>>>>>> 1. Before preparing to throw a StackOverflow exception, the last Java
>>>>>> frame is set to the current sp. This is problematic when the
>>>>>> StackOverflow
>>>>>> is thrown on top of c2i adapter. Solution : the real caller
>>>>>> frame(unextended
>>>>>> sp) should be set as last Java frame.
>>>>>>
>>>>>> 2. When Garbage collection happens during throwing StackOverflow
>>>>>> exception, and callee-saved register(or "never-saved" register
>>>>>> which behaves
>>>>>> alike) happens to be an OOP, GC is unable to locate the OOP in C
>>>>>> frame(generated by VM calls), and hence fails to process the OOP.
>>>>>> Solution :
>>>>>> We build a runtime stub frame before doing a VM call, which
>>>>>> guarantees the
>>>>>> location of Callee-saved registers are always recognizable by GC.
>>>>>>
>>>>>> Detailed descriptions and tests could be found in webrev
>>>>>> http://www.sapjvm.com/yw/webrevs/StackOverflow_GC_Crash/
>>>>>>
>>>>>> I don't have a bugID yet. Please kindly open one bug for this issue.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yang
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Yang Wang
>>>>>> Software Engineer
>>>>>>
>>>>>> TIP Core AS&VM Technology (AG)
>>>>>> SAP JVM JIT Compiler
>>>>>>
>>>>>> SAP AG
>>>>>> Dietmar-Hopp-Allee 16
>>>>>> 69190 Walldorf, Germany
>>>>>>
>>>>>> T +49 6227 7-50320
>>>>>> F +49 6227 78-48541
>>>>>> Email : yang02.wang at sap.com
>>>>>> www.sap.com
>>>>>>
>>>>>>
>>>>>> Pflichtangaben/Mandatory Disclosure Statements:
>>>>>> http://www.sap.com/company/legal/impressum.epx
>>>>>>
>>>>>> Diese E-Mail kann Betriebs- oder Geschäftsgeheimnisse oder sonstige
>>>>>> vertrauliche Informationen enthalten. Sollten Sie diese E-Mail
>>>>>> irrtümlich
>>>>>> erhalten haben, ist Ihnen eine Kenntnisnahme des Inhalts, eine
>>>>>> Vervielfältigung oder Weitergabe der E-Mail ausdrücklich untersagt.
>>>>>> Bitte
>>>>>> benachrichtigen Sie uns und vernichten Sie die empfangene E-Mail.
>>>>>> Vielen
>>>>>> Dank.
>>>>>>
>>>>>> This e-mail may contain trade secrets or privileged, undisclosed, or
>>>>>> otherwise confidential information. If you have received this
>>>>>> e-mail in
>>>>>> error, you are hereby notified that any review, copying, or
>>>>>> distribution of
>>>>>> it is strictly prohibited. Please inform us immediately and destroy
>>>>>> the
>>>>>> original transmittal. Thank you for your cooperation.
>>>>>>
>>>>>>
>>>>
>>>
>>>
>>
>>
>
>
--
Bertrand Delsart, bertrand.delsart at oracle.com,
Sun-Oracle France, 180 av. de l'Europe, ZIRST de Montbonnot,
38334 Saint Ismier, FRANCE
More information about the hotspot-compiler-dev
mailing list