RFR 8203466: intermittent crash at jdk.internal.misc.Unsafe::getObjectVolatile (native)

dean.long at oracle.com dean.long at oracle.com
Tue Sep 18 17:00:26 UTC 2018


A lot of the barrier code is saving registers around calls:

  __ pusha();             // push registers (overkill)
  /* call_vm */
  __ popa();

however, the "overkill" comment now seems like it should be "necessary 
for now".
If we wanted to stress test our stubs and make failures like this more 
reproducible,
we could remove the pusha/popa and make call_VM in debug builds trash the
volatile registers on return.

dl

On 9/18/18 7:31 AM, coleen.phillimore at oracle.com wrote:
>
> Hi, I just pushed this because I wasn't expecting more comments.
>
> On 9/18/18 9:55 AM, Doerr, Martin wrote:
>> Hi Coleen,
>>
>> is it really safe to keep a value in R9 across the barrier's runtime 
>> call?
>
> I was going to save r15 also to thread local, but since r9 wasn't the 
> problem we observed and is not aliased to something like rscratch1, 
> which invites reuse, I think it's safe for now.  I don't think this 
> code is future-proof and hand coded assembly for performance is 
> something I wish we could avoid.
>
>> It's specified as volatile in [1].
>>
>> I think it'd be nice if we had a dedicated stub calling convention. 
>> Calling (almost) only cpu specific stubs via os+cpu specific C 
>> calling conventions is a mess. But thanks for fixing the issue.
>
> I would be nice if this was less of a mess.  Let me know if you find 
> that this doesn't fix the issue.  This wasn't reproduceable.
> thanks,
> Coleen
>>
>> Best regards,
>> Martin
>>
>> [1] 
>> https://docs.microsoft.com/en-us/cpp/build/register-usage?view=vs-2017
>>
>>
>> -----Original Message-----
>> From: hotspot-dev <hotspot-dev-bounces at openjdk.java.net> On Behalf Of 
>> coleen.phillimore at oracle.com
>> Sent: Dienstag, 18. September 2018 14:55
>> To: Erik Österlund <erik.osterlund at oracle.com>; 
>> hotspot-dev at openjdk.java.net
>> Subject: Re: RFR 8203466: intermittent crash at 
>> jdk.internal.misc.Unsafe::getObjectVolatile (native)
>>
>> Thanks Erik!
>> Coleen
>>
>> On 9/18/18 5:45 AM, Erik Österlund wrote:
>>> Hi Coleen,
>>>
>>> Looks good. I guess the whole thing looks horrible in a way, but the
>>> patch is good.
>>> Thanks for fixing this.
>>>
>>> /Erik
>>>
>>> On 2018-09-18 01:47, coleen.phillimore at oracle.com wrote:
>>>>
>>>> On 9/17/18 5:17 PM, dean.long at oracle.com wrote:
>>>>> Can you move the JavaThread _WIN64 changes to thread_windows_x86.hpp?
>>>> Thanks for looking at this.  I moved it to the better place:
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/8203466.02/webrev/index.html
>>>>
>>>> Retested with build and ran one test 100 times that failed once.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>> dl
>>>>>
>>>>> On 9/17/18 5:05 AM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> On 9/14/18 4:01 PM, dean.long at oracle.com wrote:
>>>>>>> Hi Coleen.  In the bug you asked:
>>>>>>>
>>>>>>>> Why can't we just push rdi and rsi to the stack and restore
>>>>>>> them? This code appears to be leaf code so shouldn't care about
>>>>>>> stack format
>>>>>>>
>>>>>>> Did that ever get answered?  If there is concern about the pushes
>>>>>>> and pops canceling out at the end, you could always reserve the
>>>>>>> stack space at the beginning, after
>>>>>>>
>>>>>>> __ enter();
>>>>>>>
>>>>>>> and then setup_arg_regs/restore_arg_regs would not need to adjust
>>>>>>> the stack pointer.
>>>>>> The reason I didn't do this is that I couldn't get it to work for
>>>>>> all the various call sites.  Also, the barrier code is going to
>>>>>> need the thread register sometimes which wasn't available. So I
>>>>>> went with this.  I'm somewhat unclear with how this is called from
>>>>>> the c2 code.  If you want to try this solution, I should probably
>>>>>> give up and reassign it to compiler.
>>>>>>
>>>>>> thanks,
>>>>>> coleen
>>>>>>
>>>>>>> dl
>>>>>>>
>>>>>>> On 9/14/18 11:32 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>> Summary: Save windows 64 callee saved registers rsi, rdi to
>>>>>>>> Thread, save r15, also callee saved to r9
>>>>>>>>
>>>>>>>> This is done for generated stubs that do arraycopy that need GC
>>>>>>>> barrier code because GC assumes that r10 is scratch, since it's
>>>>>>>> defined as rscratch1.  See bug for more details. Thanks Erik for
>>>>>>>> your help and the diagnosis.
>>>>>>>>
>>>>>>>> Tested with mach5 hs-tier1-7.  We don't have a reproduceable test
>>>>>>>> case.
>>>>>>>>
>>>>>>>> open webrev at 
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8203466.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8203466
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>>
>



More information about the hotspot-dev mailing list