RFR 8191278: MappedByteBuffer bulk access memory failures are not handled gracefully
Jamsheed
jamsheed.c.m at oracle.com
Fri May 31 17:04:31 UTC 2019
Hi Dan,
Thanks for the review and the feedback, revised webrev (incremental from
webrev.05) http://cr.openjdk.java.net/~jcm/8191278/webrev.06_1/
On 29/05/19 2:02 AM, Daniel D. Daugherty wrote:
> On 4/30/19 11:17 PM, Jamsheed wrote:
>> Hi Vladimir,
>>
>> Thank you for all the feedback.
>>
>> please find the revised webrev.
>>
>> http://cr.openjdk.java.net/~jcm/8191278/webrev.05/
>
> General comment:
>
> - Way back on 2019.03.01 Andrew Haley asked about performance testing.
> I can't find a clear answer that says what performance testing was
> done, when and on which version of the patch.
it is not being done. i will do it once patch is finalized.
>
> - Please double check copyright years on the files before you push;
> I spotted these (but I didn't check all of them):
> src/hotspot/cpu/x86/assembler_x86.cpp
> src/hotspot/share/opto/library_call.cpp
Done.
>
> - This is only a partial review. I didn't review non-Oracle platform
> or OS code.
Ok.
>
> src/hotspot/share/runtime/stubRoutines.hpp
> L371 - nit - please delete extra blank line
Done
>
> src/hotspot/share/runtime/stubRoutines.cpp
> No comments.
>
> src/hotspot/share/runtime/thread.hpp
> No comments.
>
> src/hotspot/share/prims/unsafe.cpp
> L160: // been truncated and is now invalid
> nit - please add '.' to end the sentence (came from the
> original, moved code).
Done.
>
> src/hotspot/share/opto/library_call.cpp
> In src/hotspot/share/runtime/thread.hpp:
> L1090: volatile bool _doing_unsafe_access;
>
> L4220: store_to_memory(control(), doing_unsafe_access_addr,
> intcon(1), T_BYTE, Compile::AliasIdxRaw, MemNode::unordered);
> L4230: store_to_memory(control(), doing_unsafe_access_addr,
> intcon(0), T_BYTE, Compile::AliasIdxRaw, MemNode::unordered);
> Is T_BYTE a safe assumption (size wise) for writing to a C++
> bool?
Thanks for pointing this.
hope [1] will work ?
>
>
> src/hotspot/cpu/sparc/stubGenerator_sparc.cpp
> L5867: UnsafeCopyMemory::create_table(8);
> Should the literal '8' be a defined value from somewhere?
Done. used a new #define.
>
> src/hotspot/cpu/x86/assembler_x86.cpp
> L4191: NOT_LP64(assert(VM_Version::supports_sse2(), ""));
> L4310: NOT_LP64(assert(VM_Version::supports_sse2(), ""));
> assert() string should not be empty.
Done.
>
> src/hotspot/cpu/x86/assembler_x86.hpp
> No comments.
>
> src/hotspot/cpu/x86/stubGenerator_x86_32.cpp
> L3919: UnsafeCopyMemory::create_table(8);
> Should the literal '8' be a defined value from somewhere?
Done used a new #define.
>
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp
> L1505-08 - nit - indent is 2 spaces too much
Done.
>
> L6232: UnsafeCopyMemory::create_table(32);
> Should the literal '32' be a defined value from somewhere?
Done used a new #define.
>
> Note: This is a mind numbing amount of assembly code changes.
> It's hard to say that I've reviewed and truly understand
> all the changes.
>
> src/hotspot/cpu/x86/stubRoutines_x86.hpp
> No comments.
>
> src/hotspot/os/windows/os_windows.cpp
> L2565: bool is_unsafe_arraycopy = (_thread_in_native ||
> in_java) && UnsafeCopyMemory::contains_pc(pc);
> I'm not understanding the use of '_thread_in_native' as a flag
> instead of a comparator against thread->thread_state(). Should
> that part be this:
>
> thread->thread_state() == _thread_in_native
>
> That would match with how the 'in_java' var is set:
>
> L2457: bool in_java = thread->thread_state() ==
> _thread_in_Java;
Thanks for pointing this. Corrected.
>
> src/hotspot/os_cpu/bsd_x86/os_bsd_x86.cpp
> L280: nit - please delete extra blank line
Done.
>
> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp
> No comments.
>
> src/hotspot/os_cpu/solaris_sparc/os_solaris_sparc.cpp
> No comments.
>
> src/hotspot/os_cpu/solaris_x86/os_solaris_x86.cpp
> No comments.
>
> test/hotspot/jtreg/runtime/Unsafe/InternalErrorTest.java
> L31: * @library /test/lib
> L32: * @build sun.hotspot.WhiteBox
> Not sure why the '@build' is indented.
>
> L42: import java.lang.reflect.Field;
> L48: import java.lang.reflect.Method;
> L48 should be after L42.
>
> L95: } catch (InternalError e) {
> L99: }
> nit - L99 indent is 2 spaces off
>
> L96: if (!e.getMessage().contains("fault occurred
> in a recent unsafe memory access")) {
> L111: if (!e.getMessage().contains("fault occurred
> in a recent unsafe memory access")) {
> L126: if (!e.getMessage().contains("fault occurred
> in a recent unsafe memory access")) {
> These literal strings should be replace by a single
> final string.
>
> L102: Method m =
> InternalErrorTest.class.getMethod("test",MappedByteBuffer.class,
> Unsafe.class, long.class, long.class, int.class);
> L117: m =
> InternalErrorTest.class.getMethod("test",MappedByteBuffer.class,
> Unsafe.class, long.class, long.class, int.class);
> nit - need space after '"test",'
>
> L135: public static void test(MappedByteBuffer buffer, Unsafe
> unsafe, long mapAddr, long allocMem, int type ) {
> nit - extra spaces before ')'
>
> L138: buffer.get(new byte[8]);
> L141: unsafe.copySwapMemory(null, mapAddr +
> pageSize, new byte[4000], 16, 2000, 2);
> L144: unsafe.copySwapMemory(null, mapAddr +
> pageSize, null, allocMem, 2000, 2);
> A short comment above each of these 'tests' would help the
> reader understand what you are testing.
Done.
>
> I think the only possible significant issue here is in
> src/hotspot/os/windows/os_windows.cpp above.
>
> I did not review these files:
>
> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp
> src/hotspot/cpu/arm/stubGenerator_arm.cpp
> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp
> src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp
> src/hotspot/os_cpu/bsd_zero/os_bsd_zero.cpp
> src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp
> src/hotspot/os_cpu/linux_arm/os_linux_arm.cpp
> src/hotspot/os_cpu/linux_ppc/os_linux_ppc.cpp
> src/hotspot/os_cpu/linux_s390/os_linux_s390.cpp
> src/hotspot/os_cpu/linux_sparc/os_linux_sparc.cpp
> src/hotspot/os_cpu/linux_zero/os_linux_zero.cpp
>
>
> Dan
>
[1]
+ BasicType doing_unsafe_access_bt = T_BYTE;
+ switch(sizeof(bool)) {
+ case 1: doing_unsafe_access_bt = T_BYTE; break;
+ case 2: doing_unsafe_access_bt = T_SHORT; break;
+ case 4: doing_unsafe_access_bt = T_INT; break;
+ case 8: doing_unsafe_access_bt = T_LONG; break;
+ default: ShouldNotReachHere();
+ }
// update volatile field
- store_to_memory(control(), doing_unsafe_access_addr, intcon(1),
T_BYTE, Compile::AliasIdxRaw, MemNode::unordered);
+ store_to_memory(control(), doing_unsafe_access_addr, intcon(1),
doing_unsafe_access_bt, Compile::AliasIdxRaw, MemNode::unordered);
// Call it. Note that the length argument is not scaled.
make_runtime_call(RC_LEAF|RC_NO_FP,
OptoRuntime::fast_arraycopy_Type(),
StubRoutines::unsafe_arraycopy(),
"unsafe_arraycopy",
TypeRawPtr::BOTTOM,
src, dst, size XTOP);
- store_to_memory(control(), doing_unsafe_access_addr, intcon(0),
T_BYTE, Compile::AliasIdxRaw, MemNode::unordered);
+ store_to_memory(control(), doing_unsafe_access_addr, intcon(0),
doing_unsafe_access_bt, Compile::AliasIdxRaw, MemNode::unordered);
>
>>
>> what changed
>>
>> 1) Unsafe Copy Memory regions, error exits are captured using
>> UnsafeCopyMemoryMark, UnsafeCopyMemory.
>>
>> 2) all Unsafe copy (intrinsic or native) ,uses array copy stub.
>>
>> 3) Unsafe copyswap x64 implementation extends arraycopy stub and used
>> as reliable exit(fast exit too)*. it is not implemented for other
>> platforms.
>>
>> *copySwap was tested using copySwap test in jdk dir(with different
>> buffer sizes)
>>
>> newly added test tested on linux(aarch64,arm32,x86(32/64)) and
>> platforms in mach5. + mach(1-5)
>>
>> ppc test is not done.
>>
>> Best regards,
>>
>> Jamsheed
>>
>>
>> On 12/04/19 9:14 PM, Vladimir Kozlov wrote:
>>> Hi Jamsheed
>>>
>>> I think new methods and the table don't need to be part of
>>> StubRoutines class. Your new class is visible in all places too. You
>>> can move methods and table into new class. Then you don't need long
>>> names and new code will be in one place.
>>>
>>> Vladimir
>>>
>>> On 4/11/19 11:00 PM, Jamsheed wrote:
>>>> Hi Vladimir,
>>>>
>>>> On 12/04/19 12:20 AM, Vladimir Kozlov wrote:
>>>>> On 4/11/19 10:25 AM, Jamsheed wrote:
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> the runtime calls uses indirect call, and it is not that straight
>>>>>> forward to compare dst i guess.
>>>>>
>>>>> Okay. Yes, we may load dest into register since it is considered
>>>>> far call (outside CodeCache).
>>>>> But at least you can find nmethod. So we can do
>>>>> nm->has_unsafe_access() check.
>>>>>
>>>>>>
>>>>>> will use doing_unsafe_access and error table as you suggested,
>>>>>> doing_unsafe_access for intrinsic call doesn't require volatile
>>>>>> semantics in c2 i believe.
>>>>>
>>>>> Yes, we don't need fragile frame walking if we use
>>>>> doing_unsafe_access.
>>>>>
>>>>> There is MemBarCPUOrder already in inline_unsafe_copyMemory()
>>>>> which will prevent store instruction moves in generated code. But
>>>>> it does not prevent CPU (Aarch64?) to schedule store in different
>>>>> place.
>>>>>
>>>>> On other hand we need to read it in Signal handle. I would assume
>>>>> all stores should be flushed when we hit SEGV.
>>>> yes
>>>>>
>>>>> I thought about avoiding your error table. But you really need
>>>>> continuation PC for graceful return.
>>>>> I was thinking to have a new separate stub to restore registers
>>>>> and pop frame. But return code in stubs varies unfortunately. So
>>>>> we need a table.
>>>>>
>>>>> One complain about table is its name too long. And it should be
>>>>> unsafe_copymemory to hint intrinsic. Can it be
>>>>> unsafe_copymemory_error and UnsafeCopyMemoryError class.
>>>>> StubRoutines:: is_unsafe_copymemory() and
>>>>> next_pc_for_unsafe_copymemory_error()
>>>> yes
>>>>>
>>>>> I did not get why you providing next PC only for 64 bit VM.
>>>>
>>>> next_pc is calculated for all case(both 32 bit and 64 bit). this
>>>> should work for c2-intrisics at-least except for arm.
>>>>
>>>> fast exit is implemented only for x64, as of now.
>>>>
>>>>>
>>>>>>
>>>>>> all code whose behavior is unpredictable will be removed. like
>>>>>> arm int/c1, non intrinsic c2 (And other platforms).
>>>>>
>>>>> Okay.
>>>>
>>>> so i am planning to remove graceful exit for all unpredictable
>>>> cases. so old behavior will be seen if there is an exit at
>>>> startup(SIGBUS crash).
>>>>
>>>> and steady state use will be mostly c2 intrinsic and will have
>>>> graceful exit.
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>> On 11/04/19 5:17 AM, Jamsheed wrote:
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>> On 10/04/19 10:03 PM, Vladimir Kozlov wrote:
>>>>>>>> Okay, I see what you did. But it seems incomplete. You did not
>>>>>>>> set continue_pc for some platforms. Why?
>>>>>>> for some platform i don't have testing setup, others are not
>>>>>>> very much used in servers(32 bit case).
>>>>>>>>
>>>>>>>> Also this will trigger the code if we hit segv for normal
>>>>>>>> arraycopy. You may want to lookup caller frame to get address
>>>>>>>> from call instruction and compare it with _unsafe_arraycopy:
>>>>>>>>
>>>>>>>> if (StubCodeDesc::desc_for(pc)) {
>>>>>>>> frame fr = os::fetch_frame_from_context(uc);
>>>>>>>> address ret_pc = fr.sender_pc();
>>>>>>>> CodeBlob* cb = CodeCache::find_blob_unsafe(ret_pc);
>>>>>>>> CompiledMethod* nm = (cb != NULL) ?
>>>>>>>> cb->as_compiled_method_or_null() : NULL;
>>>>>>>> if (nm != NULL && NativeCall::is_call_before(ret_pc)) {
>>>>>>>> address dest = nativeCall_before(ret_pc)->destination();
>>>>>>>> if (dest == StubRoutines::_unsafe_arraycopy) {
>>>>>>>>
>>>>>>>> But you need to verify if it works.
>>>>>>>
>>>>>>> this should work i guess.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Jamsheed
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 4/9/19 8:08 PM, Jamsheed wrote:
>>>>>>>>> Hi Vladimir,
>>>>>>>>>
>>>>>>>>> Thank you for looking at this.
>>>>>>>>>
>>>>>>>>> On 10/04/19 4:01 AM, Vladimir Kozlov wrote:
>>>>>>>>>> Hi Jamsheed,
>>>>>>>>>>
>>>>>>>>>> Instead of finding PC in stubs we should use something
>>>>>>>>>> similar to GuardUnsafeAccess to set thread's
>>>>>>>>>> doing_unsafe_access flag when we call copy stub for unsafe
>>>>>>>>>> memory as you suggested first (in bug report).
>>>>>>>>>>
>>>>>>>>>> Interpreter set the flag for Unsafe.CopyMemory0() and
>>>>>>>>>> Unsafe.copySwapMemory0(). C2 has intrinsic only for
>>>>>>>>>> CopyMemory0():
>>>>>>>>>>
>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/6ad0281a654e/src/hotspot/share/opto/library_call.cpp#l4189
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It only use unsafe_arraycopy stab:
>>>>>>>>>>
>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/6ad0281a654e/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp#l2434
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Setting on it's entry and cleaning on exit
>>>>>>>>>> Thread::_doing_unsafe_access field should be enough. Right?
>>>>>>>>>>
>>>>>>>>>> Or I am missing something?
>>>>>>>>>
>>>>>>>>> initially thought of implementing it that way[1], but as it is
>>>>>>>>> having both store and volatile semantics went with this zero
>>>>>>>>> overhead solution.
>>>>>>>>>
>>>>>>>>> also, that doesn't provide me continuation pc, which is
>>>>>>>>> required for fast exit for bulkaccess or even for graceful
>>>>>>>>> exit in platform like arm.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> An other thing which bothering me is Harold's comment:
>>>>>>>>>>
>>>>>>>>>> "Unfortunately, "CompiledMethod* nm" gets set to NULL and so
>>>>>>>>>> handle_unsafe_access() is not executed."
>>>>>>>>>>
>>>>>>>>>> Where/why nm is NULLed?
>>>>>>>>> as we are in BufferBlob/RuntimeBlob(stub frame).
>>>>>>>>>>
>>>>>>>>>> Actually I think the whole code for "// BugId 4454115" is
>>>>>>>>>> questionable since it replaces any crash (most likely not
>>>>>>>>>> related to unsafe access) in compiled method which has at
>>>>>>>>>> least one unsafe access with exception. May be we should use
>>>>>>>>>> PcDesc to record unsafe instructions and compare with SEGV
>>>>>>>>>> PC. But it is separate RFE. For this one we need to fix only
>>>>>>>>>> Unsafe.CopyMemory0() C2 inrinsic.
>>>>>>>>>
>>>>>>>>> Right, Ok.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Jamsheed
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>> [1]http://cr.openjdk.java.net/~jcm/8191278/webrev.00/src/hotspot/share/opto/library_call.cpp.udiff.html
>>>>>>>>>
>>>>>>>>>> On 4/8/19 4:21 AM, Tobias Hartmann wrote:
>>>>>>>>>>> Hi Jamsheed,
>>>>>>>>>>>
>>>>>>>>>>> On 05.04.19 15:11, Jamsheed wrote:
>>>>>>>>>>>> On 04/04/19 7:23 PM, Andrew Haley wrote:
>>>>>>>>>>>>>> this looks reasonable to me although the code is getting
>>>>>>>>>>>>>> quite complicated to handle this edge case.
>>>>>>>>>>>>> Yeah, it really is. Can't we just assume that *any* fault
>>>>>>>>>>>>> in these stubs is
>>>>>>>>>>>>> caused via Unsafe, and get rid of bool
>>>>>>>>>>>>> unsafe_copy_code_range?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the feedback Tobias, Andrew, removed
>>>>>>>>>>>> unsafe_copy_code_range.
>>>>>>>>>>>>
>>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~jcm/8191278/webrev.04/
>>>>>>>>>>>
>>>>>>>>>>> I think what Andrew meant is basically to go with webrev.01:
>>>>>>>>>>> http://cr.openjdk.java.net/~jcm/8191278/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Right?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Tobias
>>>>>>>>>>>
>>
>
More information about the hotspot-dev
mailing list