RFR 8191278: MappedByteBuffer bulk access memory failures are not handled gracefully
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue May 28 20:32:02 UTC 2019
On 4/30/19 11:17 PM, Jamsheed wrote:
> Hi Vladimir,
>
> Thank you for all the feedback.
>
> please find the revised webrev.
>
> http://cr.openjdk.java.net/~jcm/8191278/webrev.05/
General comment:
- Way back on 2019.03.01 Andrew Haley asked about performance testing.
I can't find a clear answer that says what performance testing was
done, when and on which version of the patch.
- Please double check copyright years on the files before you push;
I spotted these (but I didn't check all of them):
src/hotspot/cpu/x86/assembler_x86.cpp
src/hotspot/share/opto/library_call.cpp
- This is only a partial review. I didn't review non-Oracle platform
or OS code.
src/hotspot/share/runtime/stubRoutines.hpp
L371 - nit - please delete extra blank line
src/hotspot/share/runtime/stubRoutines.cpp
No comments.
src/hotspot/share/runtime/thread.hpp
No comments.
src/hotspot/share/prims/unsafe.cpp
L160: // been truncated and is now invalid
nit - please add '.' to end the sentence (came from the
original, moved code).
src/hotspot/share/opto/library_call.cpp
In src/hotspot/share/runtime/thread.hpp:
L1090: volatile bool _doing_unsafe_access;
L4220: store_to_memory(control(), doing_unsafe_access_addr,
intcon(1), T_BYTE, Compile::AliasIdxRaw, MemNode::unordered);
L4230: store_to_memory(control(), doing_unsafe_access_addr,
intcon(0), T_BYTE, Compile::AliasIdxRaw, MemNode::unordered);
Is T_BYTE a safe assumption (size wise) for writing to a C++ bool?
src/hotspot/cpu/sparc/stubGenerator_sparc.cpp
L5867: UnsafeCopyMemory::create_table(8);
Should the literal '8' be a defined value from somewhere?
src/hotspot/cpu/x86/assembler_x86.cpp
L4191: NOT_LP64(assert(VM_Version::supports_sse2(), ""));
L4310: NOT_LP64(assert(VM_Version::supports_sse2(), ""));
assert() string should not be empty.
src/hotspot/cpu/x86/assembler_x86.hpp
No comments.
src/hotspot/cpu/x86/stubGenerator_x86_32.cpp
L3919: UnsafeCopyMemory::create_table(8);
Should the literal '8' be a defined value from somewhere?
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp
L1505-08 - nit - indent is 2 spaces too much
L6232: UnsafeCopyMemory::create_table(32);
Should the literal '32' be a defined value from somewhere?
Note: This is a mind numbing amount of assembly code changes.
It's hard to say that I've reviewed and truly understand
all the changes.
src/hotspot/cpu/x86/stubRoutines_x86.hpp
No comments.
src/hotspot/os/windows/os_windows.cpp
L2565: bool is_unsafe_arraycopy = (_thread_in_native ||
in_java) && UnsafeCopyMemory::contains_pc(pc);
I'm not understanding the use of '_thread_in_native' as a flag
instead of a comparator against thread->thread_state(). Should
that part be this:
thread->thread_state() == _thread_in_native
That would match with how the 'in_java' var is set:
L2457: bool in_java = thread->thread_state() ==
_thread_in_Java;
src/hotspot/os_cpu/bsd_x86/os_bsd_x86.cpp
L280: nit - please delete extra blank line
src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp
No comments.
src/hotspot/os_cpu/solaris_sparc/os_solaris_sparc.cpp
No comments.
src/hotspot/os_cpu/solaris_x86/os_solaris_x86.cpp
No comments.
test/hotspot/jtreg/runtime/Unsafe/InternalErrorTest.java
L31: * @library /test/lib
L32: * @build sun.hotspot.WhiteBox
Not sure why the '@build' is indented.
L42: import java.lang.reflect.Field;
L48: import java.lang.reflect.Method;
L48 should be after L42.
L95: } catch (InternalError e) {
L99: }
nit - L99 indent is 2 spaces off
L96: if (!e.getMessage().contains("fault occurred
in a recent unsafe memory access")) {
L111: if (!e.getMessage().contains("fault occurred
in a recent unsafe memory access")) {
L126: if (!e.getMessage().contains("fault occurred
in a recent unsafe memory access")) {
These literal strings should be replace by a single
final string.
L102: Method m =
InternalErrorTest.class.getMethod("test",MappedByteBuffer.class,
Unsafe.class, long.class, long.class, int.class);
L117: m =
InternalErrorTest.class.getMethod("test",MappedByteBuffer.class,
Unsafe.class, long.class, long.class, int.class);
nit - need space after '"test",'
L135: public static void test(MappedByteBuffer buffer, Unsafe
unsafe, long mapAddr, long allocMem, int type ) {
nit - extra spaces before ')'
L138: buffer.get(new byte[8]);
L141: unsafe.copySwapMemory(null, mapAddr +
pageSize, new byte[4000], 16, 2000, 2);
L144: unsafe.copySwapMemory(null, mapAddr +
pageSize, null, allocMem, 2000, 2);
A short comment above each of these 'tests' would help the
reader understand what you are testing.
I think the only possible significant issue here is in
src/hotspot/os/windows/os_windows.cpp above.
I did not review these files:
src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp
src/hotspot/cpu/arm/stubGenerator_arm.cpp
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp
src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp
src/hotspot/os_cpu/bsd_zero/os_bsd_zero.cpp
src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp
src/hotspot/os_cpu/linux_arm/os_linux_arm.cpp
src/hotspot/os_cpu/linux_ppc/os_linux_ppc.cpp
src/hotspot/os_cpu/linux_s390/os_linux_s390.cpp
src/hotspot/os_cpu/linux_sparc/os_linux_sparc.cpp
src/hotspot/os_cpu/linux_zero/os_linux_zero.cpp
Dan
>
> what changed
>
> 1) Unsafe Copy Memory regions, error exits are captured using
> UnsafeCopyMemoryMark, UnsafeCopyMemory.
>
> 2) all Unsafe copy (intrinsic or native) ,uses array copy stub.
>
> 3) Unsafe copyswap x64 implementation extends arraycopy stub and used
> as reliable exit(fast exit too)*. it is not implemented for other
> platforms.
>
> *copySwap was tested using copySwap test in jdk dir(with different
> buffer sizes)
>
> newly added test tested on linux(aarch64,arm32,x86(32/64)) and
> platforms in mach5. + mach(1-5)
>
> ppc test is not done.
>
> Best regards,
>
> Jamsheed
>
>
> On 12/04/19 9:14 PM, Vladimir Kozlov wrote:
>> Hi Jamsheed
>>
>> I think new methods and the table don't need to be part of
>> StubRoutines class. Your new class is visible in all places too. You
>> can move methods and table into new class. Then you don't need long
>> names and new code will be in one place.
>>
>> Vladimir
>>
>> On 4/11/19 11:00 PM, Jamsheed wrote:
>>> Hi Vladimir,
>>>
>>> On 12/04/19 12:20 AM, Vladimir Kozlov wrote:
>>>> On 4/11/19 10:25 AM, Jamsheed wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>> the runtime calls uses indirect call, and it is not that straight
>>>>> forward to compare dst i guess.
>>>>
>>>> Okay. Yes, we may load dest into register since it is considered
>>>> far call (outside CodeCache).
>>>> But at least you can find nmethod. So we can do
>>>> nm->has_unsafe_access() check.
>>>>
>>>>>
>>>>> will use doing_unsafe_access and error table as you suggested,
>>>>> doing_unsafe_access for intrinsic call doesn't require volatile
>>>>> semantics in c2 i believe.
>>>>
>>>> Yes, we don't need fragile frame walking if we use
>>>> doing_unsafe_access.
>>>>
>>>> There is MemBarCPUOrder already in inline_unsafe_copyMemory() which
>>>> will prevent store instruction moves in generated code. But it does
>>>> not prevent CPU (Aarch64?) to schedule store in different place.
>>>>
>>>> On other hand we need to read it in Signal handle. I would assume
>>>> all stores should be flushed when we hit SEGV.
>>> yes
>>>>
>>>> I thought about avoiding your error table. But you really need
>>>> continuation PC for graceful return.
>>>> I was thinking to have a new separate stub to restore registers and
>>>> pop frame. But return code in stubs varies unfortunately. So we
>>>> need a table.
>>>>
>>>> One complain about table is its name too long. And it should be
>>>> unsafe_copymemory to hint intrinsic. Can it be
>>>> unsafe_copymemory_error and UnsafeCopyMemoryError class.
>>>> StubRoutines:: is_unsafe_copymemory() and
>>>> next_pc_for_unsafe_copymemory_error()
>>> yes
>>>>
>>>> I did not get why you providing next PC only for 64 bit VM.
>>>
>>> next_pc is calculated for all case(both 32 bit and 64 bit). this
>>> should work for c2-intrisics at-least except for arm.
>>>
>>> fast exit is implemented only for x64, as of now.
>>>
>>>>
>>>>>
>>>>> all code whose behavior is unpredictable will be removed. like arm
>>>>> int/c1, non intrinsic c2 (And other platforms).
>>>>
>>>> Okay.
>>>
>>> so i am planning to remove graceful exit for all unpredictable
>>> cases. so old behavior will be seen if there is an exit at
>>> startup(SIGBUS crash).
>>>
>>> and steady state use will be mostly c2 intrinsic and will have
>>> graceful exit.
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>> On 11/04/19 5:17 AM, Jamsheed wrote:
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> On 10/04/19 10:03 PM, Vladimir Kozlov wrote:
>>>>>>> Okay, I see what you did. But it seems incomplete. You did not
>>>>>>> set continue_pc for some platforms. Why?
>>>>>> for some platform i don't have testing setup, others are not very
>>>>>> much used in servers(32 bit case).
>>>>>>>
>>>>>>> Also this will trigger the code if we hit segv for normal
>>>>>>> arraycopy. You may want to lookup caller frame to get address
>>>>>>> from call instruction and compare it with _unsafe_arraycopy:
>>>>>>>
>>>>>>> if (StubCodeDesc::desc_for(pc)) {
>>>>>>> frame fr = os::fetch_frame_from_context(uc);
>>>>>>> address ret_pc = fr.sender_pc();
>>>>>>> CodeBlob* cb = CodeCache::find_blob_unsafe(ret_pc);
>>>>>>> CompiledMethod* nm = (cb != NULL) ?
>>>>>>> cb->as_compiled_method_or_null() : NULL;
>>>>>>> if (nm != NULL && NativeCall::is_call_before(ret_pc)) {
>>>>>>> address dest = nativeCall_before(ret_pc)->destination();
>>>>>>> if (dest == StubRoutines::_unsafe_arraycopy) {
>>>>>>>
>>>>>>> But you need to verify if it works.
>>>>>>
>>>>>> this should work i guess.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 4/9/19 8:08 PM, Jamsheed wrote:
>>>>>>>> Hi Vladimir,
>>>>>>>>
>>>>>>>> Thank you for looking at this.
>>>>>>>>
>>>>>>>> On 10/04/19 4:01 AM, Vladimir Kozlov wrote:
>>>>>>>>> Hi Jamsheed,
>>>>>>>>>
>>>>>>>>> Instead of finding PC in stubs we should use something similar
>>>>>>>>> to GuardUnsafeAccess to set thread's doing_unsafe_access flag
>>>>>>>>> when we call copy stub for unsafe memory as you suggested
>>>>>>>>> first (in bug report).
>>>>>>>>>
>>>>>>>>> Interpreter set the flag for Unsafe.CopyMemory0() and
>>>>>>>>> Unsafe.copySwapMemory0(). C2 has intrinsic only for
>>>>>>>>> CopyMemory0():
>>>>>>>>>
>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/6ad0281a654e/src/hotspot/share/opto/library_call.cpp#l4189
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It only use unsafe_arraycopy stab:
>>>>>>>>>
>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/6ad0281a654e/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp#l2434
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Setting on it's entry and cleaning on exit
>>>>>>>>> Thread::_doing_unsafe_access field should be enough. Right?
>>>>>>>>>
>>>>>>>>> Or I am missing something?
>>>>>>>>
>>>>>>>> initially thought of implementing it that way[1], but as it is
>>>>>>>> having both store and volatile semantics went with this zero
>>>>>>>> overhead solution.
>>>>>>>>
>>>>>>>> also, that doesn't provide me continuation pc, which is
>>>>>>>> required for fast exit for bulkaccess or even for graceful
>>>>>>>> exit in platform like arm.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> An other thing which bothering me is Harold's comment:
>>>>>>>>>
>>>>>>>>> "Unfortunately, "CompiledMethod* nm" gets set to NULL and so
>>>>>>>>> handle_unsafe_access() is not executed."
>>>>>>>>>
>>>>>>>>> Where/why nm is NULLed?
>>>>>>>> as we are in BufferBlob/RuntimeBlob(stub frame).
>>>>>>>>>
>>>>>>>>> Actually I think the whole code for "// BugId 4454115" is
>>>>>>>>> questionable since it replaces any crash (most likely not
>>>>>>>>> related to unsafe access) in compiled method which has at
>>>>>>>>> least one unsafe access with exception. May be we should use
>>>>>>>>> PcDesc to record unsafe instructions and compare with SEGV PC.
>>>>>>>>> But it is separate RFE. For this one we need to fix only
>>>>>>>>> Unsafe.CopyMemory0() C2 inrinsic.
>>>>>>>>
>>>>>>>> Right, Ok.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Jamsheed
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>> [1]http://cr.openjdk.java.net/~jcm/8191278/webrev.00/src/hotspot/share/opto/library_call.cpp.udiff.html
>>>>>>>>
>>>>>>>>> On 4/8/19 4:21 AM, Tobias Hartmann wrote:
>>>>>>>>>> Hi Jamsheed,
>>>>>>>>>>
>>>>>>>>>> On 05.04.19 15:11, Jamsheed wrote:
>>>>>>>>>>> On 04/04/19 7:23 PM, Andrew Haley wrote:
>>>>>>>>>>>>> this looks reasonable to me although the code is getting
>>>>>>>>>>>>> quite complicated to handle this edge case.
>>>>>>>>>>>> Yeah, it really is. Can't we just assume that *any* fault
>>>>>>>>>>>> in these stubs is
>>>>>>>>>>>> caused via Unsafe, and get rid of bool unsafe_copy_code_range?
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the feedback Tobias, Andrew, removed
>>>>>>>>>>> unsafe_copy_code_range.
>>>>>>>>>>>
>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~jcm/8191278/webrev.04/
>>>>>>>>>>
>>>>>>>>>> I think what Andrew meant is basically to go with webrev.01:
>>>>>>>>>> http://cr.openjdk.java.net/~jcm/8191278/webrev.01/
>>>>>>>>>>
>>>>>>>>>> Right?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Tobias
>>>>>>>>>>
>
More information about the hotspot-dev
mailing list