RFR 8191278: MappedByteBuffer bulk access memory failures are not handled gracefully

Tue Jun 4 01:06:18 UTC 2019

Looks good.

Thanks,
Vladimir

On 6/3/19 11:42 AM, Jamsheed wrote:
> Hi Vladimir, Dan,
> 
> added two cases in  Assembler::locate_operand as movdqa was unrecognized. update webrev in place
> 
> +    case 0x6F: // movdq
> +    case 0x7F: // movdq
> 
> full diff : http://cr.openjdk.java.net/~jcm/8191278/webrev.06/
> 
> http://cr.openjdk.java.net/~jcm/8191278/webrev.05/ + (incremental patch) 
> http://cr.openjdk.java.net/~jcm/8191278/webrev.06_02/
> 
> Best regards,
> 
> Jamsheed
> 
> On 31/05/19 10:34 PM, Jamsheed wrote:
>> Hi Vladimir, Dan,
>>
>> On 15/05/19 2:24 AM, Vladimir Kozlov wrote:
>>> Sorry, I missed this. Please "ping" if you do not get responses.
>>>
>>> It become even more complicated :( but I understand why you are doing this way. You did great job.
>>>
>>> I am thinking about separating the fix for arraycopy stubs fix and adding graceful exit for Unsafe_CopyMemory and 
>>> Unsafe_CopySwapMemory.
>>   It would be really good if it can be done in runtime, i tried this approach as it seemed far more easier.
>>
>>   aarch64 port cpp implementation actually crashes for Unsafe_CopyMemory.
>>
>>>
>>> My main concern is new swap code complicates reliable arraycopy code for very corner case. And you implemented it 
>>> only for x64 anyway.
>>
>> copySwap cpp code may work for all cases (depends on compiler again) , so i am ok with removing the changes.
>>
>> but today when i was testing the removed code[1], found another issue. some x86 instructions generated by cpp copyswap 
>> code  is not recognized by our assembler and it crashes in signalHandler Assembler::locate_next_instruction(pc). i 
>> will investigate and file a bug.
>>
>> Best regards,
>>
>> Jamsheed
>>
>> [1]
>>
>> full diff : http://cr.openjdk.java.net/~jcm/8191278/webrev.06/
>>
>> http://cr.openjdk.java.net/~jcm/8191278/webrev.05/ + (incremental patch) 
>> http://cr.openjdk.java.net/~jcm/8191278/webrev.06_02/
>>
>>>
>>> I would like to hear what Runtime group think.
>>>
>>> On 4/30/19 8:17 PM, Jamsheed wrote:
>>>> Hi Vladimir,
>>>>
>>>> Thank you for all the feedback.
>>>>
>>>> please find the revised webrev.
>>>>
>>>> http://cr.openjdk.java.net/~jcm/8191278/webrev.05/
>>>>
>>>> what changed
>>>>
>>>> 1) Unsafe Copy Memory regions, error exits are captured using UnsafeCopyMemoryMark, UnsafeCopyMemory.
>>>
>>> This is good.
>>>
>>>>
>>>> 2) all Unsafe copy (intrinsic or native) ,uses array copy stub.
>>>
>>> Right, otherwise we would have to duplicate logic in Copy:: platform specific C++ methods. But may be it is fine to 
>>> do in C++ in this case. Or not do that at all as other platforms.
>>>
>>>>
>>>> 3) Unsafe copyswap x64 implementation extends arraycopy stub and used as reliable exit(fast exit too)*. it is not 
>>>> implemented for other platforms.
>>>
>>> As I said I have concern about this.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>>
>>>> *copySwap was tested using copySwap test in jdk dir(with different buffer sizes)
>>>>
>>>> newly added test tested on linux(aarch64,arm32,x86(32/64)) and platforms in mach5. + mach(1-5)
>>>>
>>>> ppc test is not done.
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>>
>>>> On 12/04/19 9:14 PM, Vladimir Kozlov wrote:
>>>>> Hi Jamsheed
>>>>>
>>>>> I think new methods and the table don't need to be part of StubRoutines class. Your new class is visible in all 
>>>>> places too. You can move methods and table into new class. Then you don't need long names and new code will be in 
>>>>> one place.
>>>>>
>>>>> Vladimir
>>>>>
>>>>> On 4/11/19 11:00 PM, Jamsheed wrote:
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> On 12/04/19 12:20 AM, Vladimir Kozlov wrote:
>>>>>>> On 4/11/19 10:25 AM, Jamsheed wrote:
>>>>>>>> Hi Vladimir,
>>>>>>>>
>>>>>>>> the runtime calls uses indirect call, and it is not that straight forward to compare dst i guess.
>>>>>>>
>>>>>>> Okay. Yes, we may load dest into register since it is considered far call (outside CodeCache).
>>>>>>> But at least you can find nmethod. So we can do nm->has_unsafe_access() check.
>>>>>>>
>>>>>>>>
>>>>>>>> will use doing_unsafe_access and error table as you suggested, doing_unsafe_access for intrinsic call doesn't 
>>>>>>>> require volatile semantics in c2 i believe.
>>>>>>>
>>>>>>> Yes, we don't need fragile frame walking if we use doing_unsafe_access.
>>>>>>>
>>>>>>> There is MemBarCPUOrder already in inline_unsafe_copyMemory() which will prevent store instruction moves in 
>>>>>>> generated code. But it does not prevent CPU (Aarch64?) to schedule store in different place.
>>>>>>>
>>>>>>> On other hand we need to read it in Signal handle. I would assume all stores should be flushed when we hit SEGV.
>>>>>> yes
>>>>>>>
>>>>>>> I thought about avoiding your error table. But you really need continuation PC for graceful return.
>>>>>>> I was thinking to have a new separate stub to restore registers and pop frame. But return code in stubs varies 
>>>>>>> unfortunately. So we need a table.
>>>>>>>
>>>>>>> One complain about table is its name too long. And it should be unsafe_copymemory to hint intrinsic. Can it be 
>>>>>>> unsafe_copymemory_error and UnsafeCopyMemoryError class.
>>>>>>> StubRoutines:: is_unsafe_copymemory() and next_pc_for_unsafe_copymemory_error()
>>>>>> yes
>>>>>>>
>>>>>>> I did not get why you providing next PC only for 64 bit VM.
>>>>>>
>>>>>> next_pc is calculated for all case(both 32 bit and 64 bit). this should work for c2-intrisics at-least except for 
>>>>>> arm.
>>>>>>
>>>>>> fast exit is implemented only for x64, as of now.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> all code whose behavior is unpredictable will be removed. like arm int/c1, non intrinsic c2 (And other platforms).
>>>>>>>
>>>>>>> Okay.
>>>>>>
>>>>>> so i am planning to remove graceful exit for all unpredictable cases. so old behavior will be seen if there is an 
>>>>>> exit at startup(SIGBUS crash).
>>>>>>
>>>>>> and steady state use will be mostly c2 intrinsic and will have graceful exit.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Jamsheed
>>>>>>>>
>>>>>>>> On 11/04/19 5:17 AM, Jamsheed wrote:
>>>>>>>>> Hi Vladimir,
>>>>>>>>>
>>>>>>>>> On 10/04/19 10:03 PM, Vladimir Kozlov wrote:
>>>>>>>>>> Okay, I see what you did. But it seems incomplete. You did not set continue_pc for some platforms. Why?
>>>>>>>>> for some platform i don't have testing setup, others are not very much used in servers(32 bit case).
>>>>>>>>>>
>>>>>>>>>> Also this will trigger the code if we hit segv for normal arraycopy. You may want to lookup caller frame to 
>>>>>>>>>> get address from call instruction and compare it with _unsafe_arraycopy:
>>>>>>>>>>
>>>>>>>>>> if (StubCodeDesc::desc_for(pc)) {
>>>>>>>>>>   frame fr = os::fetch_frame_from_context(uc);
>>>>>>>>>>   address ret_pc = fr.sender_pc();
>>>>>>>>>>   CodeBlob* cb = CodeCache::find_blob_unsafe(ret_pc);
>>>>>>>>>>   CompiledMethod* nm = (cb != NULL) ? cb->as_compiled_method_or_null() : NULL;
>>>>>>>>>>   if (nm != NULL && NativeCall::is_call_before(ret_pc)) {
>>>>>>>>>>     address dest = nativeCall_before(ret_pc)->destination();
>>>>>>>>>>     if (dest == StubRoutines::_unsafe_arraycopy) {
>>>>>>>>>>
>>>>>>>>>> But you need to verify if it works.
>>>>>>>>>
>>>>>>>>> this should work i guess.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Jamsheed
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>> On 4/9/19 8:08 PM, Jamsheed wrote:
>>>>>>>>>>> Hi Vladimir,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for looking at this.
>>>>>>>>>>>
>>>>>>>>>>> On 10/04/19 4:01 AM, Vladimir Kozlov wrote:
>>>>>>>>>>>> Hi Jamsheed,
>>>>>>>>>>>>
>>>>>>>>>>>> Instead of finding PC in stubs we should use something similar to GuardUnsafeAccess to set thread's 
>>>>>>>>>>>> doing_unsafe_access flag when we call copy stub for unsafe memory as you suggested first (in bug report).
>>>>>>>>>>>>
>>>>>>>>>>>> Interpreter set the flag for Unsafe.CopyMemory0() and Unsafe.copySwapMemory0(). C2 has intrinsic only for 
>>>>>>>>>>>> CopyMemory0():
>>>>>>>>>>>>
>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/6ad0281a654e/src/hotspot/share/opto/library_call.cpp#l4189
>>>>>>>>>>>>
>>>>>>>>>>>> It only use unsafe_arraycopy stab:
>>>>>>>>>>>>
>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/6ad0281a654e/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp#l2434
>>>>>>>>>>>>
>>>>>>>>>>>> Setting on it's entry and cleaning on exit Thread::_doing_unsafe_access field should be enough. Right?
>>>>>>>>>>>>
>>>>>>>>>>>> Or I am missing something?
>>>>>>>>>>>
>>>>>>>>>>> initially thought of implementing it that way[1], but as it is having both store and volatile semantics went 
>>>>>>>>>>> with this zero overhead solution.
>>>>>>>>>>>
>>>>>>>>>>> also, that doesn't provide me  continuation pc, which is required for fast exit for bulkaccess or even for 
>>>>>>>>>>> graceful exit in platform like arm.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> An other thing which bothering me is Harold's comment:
>>>>>>>>>>>>
>>>>>>>>>>>> "Unfortunately, "CompiledMethod* nm" gets set to NULL and so handle_unsafe_access() is not executed."
>>>>>>>>>>>>
>>>>>>>>>>>> Where/why nm is NULLed?
>>>>>>>>>>> as we are in BufferBlob/RuntimeBlob(stub frame).
>>>>>>>>>>>>
>>>>>>>>>>>> Actually I think the whole code for "// BugId 4454115" is questionable since it replaces any crash (most 
>>>>>>>>>>>> likely not related to unsafe access) in compiled method which has at least one unsafe access with exception. 
>>>>>>>>>>>> May be we should use PcDesc to record unsafe instructions and compare with SEGV PC. But it is separate RFE. 
>>>>>>>>>>>> For this one we need to fix only Unsafe.CopyMemory0() C2 inrinsic.
>>>>>>>>>>>
>>>>>>>>>>> Right, Ok.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>
>>>>>>>>>>> Jamsheed
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>
>>>>>>>>>>> [1]http://cr.openjdk.java.net/~jcm/8191278/webrev.00/src/hotspot/share/opto/library_call.cpp.udiff.html
>>>>>>>>>>>> On 4/8/19 4:21 AM, Tobias Hartmann wrote:
>>>>>>>>>>>>> Hi Jamsheed,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 05.04.19 15:11, Jamsheed wrote:
>>>>>>>>>>>>>> On 04/04/19 7:23 PM, Andrew Haley wrote:
>>>>>>>>>>>>>>>> this looks reasonable to me although the code is getting quite complicated to handle this edge case.
>>>>>>>>>>>>>>> Yeah, it really is. Can't we just assume that *any* fault in these stubs is
>>>>>>>>>>>>>>> caused via Unsafe, and get rid of bool unsafe_copy_code_range?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the feedback Tobias, Andrew, removed unsafe_copy_code_range.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~jcm/8191278/webrev.04/
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think what Andrew meant is basically to go with webrev.01:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~jcm/8191278/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Right?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Tobias
>>>>>>>>>>>>>