BUG: withOwnerThread closes MappedMemorySegment

Jorn Vernee jorn.vernee at oracle.com
Fri Sep 25 15:00:05 UTC 2020


Hi Ty,

I'm trying to reproduce your crash locally, but no luck so far. From the 
error message, it seems like it might be caused by a thread being 
suspended while inside a native call, which triggers a slow path that 
seems to have a bug in it; not setting the thread state to 
_thread_in_java again.

The following patch fixes that mistake [1], but since I'm not able so 
far to reproduce the crash I can't say for sure that that will fix the 
problem you are seeing.

It would help if you could provide some steps to reproduce, and/or try 
out the patch at [1] if you can (should be enough to put it in a file 
and then use `git apply <file>`).

I'm also curious to know if you are running with the default GC, or 
maybe another one?

Thanks,
Jorn

[1] :

diff --git a/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp 
b/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
index 51b9082cfae..887369fc5c8 100644
--- a/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
+++ b/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
@@ -3810,11 +3810,12 @@ void NativeInvokerGenerator::generate() {
    __ safepoint_poll(L_safepoint_poll_slow_path, r15_thread, rscratch1);
    __ cmpl(Address(r15_thread, JavaThread::suspend_flags_offset()), 0);
    __ jcc(Assembler::notEqual, L_safepoint_poll_slow_path);
-  // change thread state
-  __ movl(Address(r15_thread, JavaThread::thread_state_offset()), 
_thread_in_Java);

    __ bind(L_after_safepoint_poll);

+  // change thread state
+  __ movl(Address(r15_thread, JavaThread::thread_state_offset()), 
_thread_in_Java);
+
    __ block_comment("reguard stack check");
    Label L_reguard;
    Label L_after_reguard

On 25/09/2020 15:54, Maurizio Cimadamore wrote:
>
> On 25/09/2020 14:33, Ty Young wrote:
>>
>> On 9/25/20 4:58 AM, Maurizio Cimadamore wrote:
>>> One idea - we have recently turned on the method handle intrinsics - 
>>> try running with these parameters:
>>>
>>> -Djdk.internal.foreign.ProgrammableInvoker.USE_SPEC=false 
>>> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=false
>>>
>>> And see if the crash still happens.
>>
>>
>> Seems like that is the issue. I can't say 100% because there are 
>> run-to-run outliers where it'll sometimes get past the 6-8 minute 
>> mark and run for awhile, only to eventually crash. Nothing yet with 
>> those turned off though.
>
> Ok, on our hand, it seems like Jorn has identified a potential culprit 
> on the intrinsics...
>
> I'd suggest to run with intrinsics disabled for the time being.
>
> Thanks
> Maurizio
>
>>
>>
>>>
>>> Maurizio
>>>
>>> On 25/09/2020 01:46, Ty Young wrote:
>>>>
>>>> On 9/24/20 4:56 AM, Maurizio Cimadamore wrote:
>>>>>
>>>>> On 24/09/2020 04:01, Ty Young wrote:
>>>>>>> There are two main issues with "just mutating the existining 
>>>>>>> segments":
>>>>>>>
>>>>>>> 1) First and foremost, C2 likes constants. So, if the thread is 
>>>>>>> constant on the segment we pay nothing for the ownership (or 
>>>>>>> lack of ownership) check.
>>>>>>>
>>>>>>> 2) Second, you can't just "flip" a switch on a shared data 
>>>>>>> structure, and expect the update to be seen by the rest of the 
>>>>>>> world (in a multi-thread scenario). On the other hand, 
>>>>>>> publishing a "new" object has more guarantees on who can do what 
>>>>>>> with that new object (meaning it's physically not possible for 
>>>>>>> other threads to start writing on the new segment before it has 
>>>>>>> been returned by the API)
>>>>>>>
>>>>>>> Note that the original segment is killed (meaning its isAlive 
>>>>>>> returns false) but the new segment still points at the same 
>>>>>>> mappped memory region. So if you want a shared memory segment, 
>>>>>>> you just have to do:
>>>>>>>
>>>>>>> MappedMemorySegment nativeFile = 
>>>>>>> MemorySegment.mapFromPath(file.toPath(), 0, 4096, 
>>>>>>> FileChannel.MapMode.READ_WRITE)
>>>>>>> .withOwnerThread(null);
>>>>>>
>>>>>>
>>>>>> Yeah, figured that'd work. I had to add an if-else to check if a 
>>>>>> segment is already unbound, which got me past that error, 
>>>>>> although my projects still aren't running because of a 
>>>>>> StackOverflow and "java.lang.NoClassDefFoundError: Could not 
>>>>>> initialize class java.lang.StackTraceElement$HashedModules" error 
>>>>>> related to MethodHandles that are being added to struct layouts. 
>>>>>> Not 100% sure what's going on there...
>>>>>
>>>>> That looks odd - if you could send the entire stack trace we could 
>>>>> perhaps quickly glance if it's coming from our API, or your code.
>>>>
>>>>
>>>> Yeah, it was my code, sorry. My application at least runs now... 
>>>> for maybe about 6-8 minutes... until it crashes.
>>>>
>>>>
>>>> Multiple crashes look like this:
>>>>
>>>>
>>>> Current thread (0x00007f17e41a4f30):  JavaThread "pool-2-thread-2" 
>>>> [_thread_in_native_trans, id=2961304, 
>>>> stack(0x00007f17d09a5000,0x00007f17d0aa6000)]
>>>>
>>>> Stack: [0x00007f17d09a5000,0x00007f17d0aa6000], 
>>>> sp=0x00007f17d0aa47b8,  free space=1021k
>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, 
>>>> j=interpreted, Vv=VM code, C=native code)
>>>> J 9169 c2 
>>>> java.lang.invoke.LambdaForm$MH+0x0000000800eb2040.invoke(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)I 
>>>> java.base at 16-internal (89 bytes) @ 0x00007f185c85867a 
>>>> [0x00007f185c858620+0x000000000000005a]
>>>>
>>>> [error occurred during error reporting (printing native stack), id 
>>>> 0xb, SIGSEGV (0xb) at pc=0x00007f1871cc7075]
>>>>
>>>>
>>>> Although a small percentage look like:
>>>>
>>>>
>>>> Current thread (0x00007f300c0c1220):  JavaThread "pool-2-thread-9" 
>>>> [_thread_in_native_trans, id=3014594, 
>>>> stack(0x00007f3001f48000,0x00007f3002049000)]
>>>>
>>>> Stack: [0x00007f3001f48000,0x00007f3002049000], 
>>>> sp=0x00007f3002047798,  free space=1021k
>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, 
>>>> j=interpreted, Vv=VM code, C=native code)
>>>> J 2028 c2 
>>>> jdk.incubator.foreign.ValueLayout.attribute(Ljava/lang/String;)Ljava/util/Optional; 
>>>> jdk.incubator.foreign at 16-internal (6 bytes) @ 0x00007f30984f1872 
>>>> [0x00007f30984f1400+0x0000000000000472]
>>>>
>>>> [error occurred during error reporting (printing native stack), id 
>>>> 0xb, SIGSEGV (0xb) at pc=0x00007f30afa54075]
>>>>
>>>>
>>>> Errors always seem to be in relation to compilation events:
>>>>
>>>>
>>>> Compiled method (c1)  357892 5293  s!   3 
>>>> org.goliath.envious.nvml.local.attributes.performance.limiters.NVMLGPUPerformanceLimitSoftwareThermalSlowdownAttribute::update 
>>>> (63 bytes)
>>>>
>>>>
>>>> Is there any JVM or panama debug arguments I can use to find the 
>>>> issue?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> Cheers
>>>>> Maurizio
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> E.g. use chaining to get to the segment you want to construct.
>>>>>>>
>>>>>>> As for the interaction with asSlice() - a slice is not a true 
>>>>>>> standalone segment - is just another view of the existing 
>>>>>>> segment with different bounds - in fact closing the slice also 
>>>>>>> closes the parent segments. So, calling withOwnerThread will 
>>>>>>> also kill all associated slices (or, calling withCleanupAction 
>>>>>>> will attach the cleanup action to all the segments which share 
>>>>>>> the same temporal bound).
>>>>>>>
>>>>>>> There is, currently, no way to create a segment, and then slice 
>>>>>>> it so that you can N segments each owned by a different thread, 
>>>>>>> if that's what you are looking for. We looked into that and it's 
>>>>>>> way too messy (not only you have to worry about how to "split" 
>>>>>>> the segment, but also about how you merge it back - and that's 
>>>>>>> the hard part). So, if you are not ok with confinement, create a 
>>>>>>> shared segment and work with it (e.g. like it was a ByteBuffer 
>>>>>>> or a Java array). It is up to you then to make sure that 
>>>>>>> multiple threads operate on the segment in a way that makes 
>>>>>>> sense (either by assigning disjoint slices to different threads, 
>>>>>>> e.g. using a spliterator, which the API supports), or by using 
>>>>>>> some synchronization.
>>>>>>
>>>>>>
>>>>>> Figured I'd have to do manual synchronization.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>>
>>>>>>> On 23/09/2020 07:47, sundararajan.athijegannathan at oracle.com wrote:
>>>>>>>> Can you access source?
>>>>>>>>
>>>>>>>> https://urldefense.com/v3/__https://github.com/openjdk/panama-foreign/blob/foreign-jextract/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/MemorySegment.java__;!!GqivPVa7Brio!OlbtDRpO3iKUiqjySSQELP4SjIC_dCXbKMZ344YiK9XD92Rg8n6jxzYxOkrj_IvkZ8Tlf78$ 
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, afaik it is implementation technical reason. (Maurizio 
>>>>>>>> will clarify). That said, it should be possible to map 
>>>>>>>> different file segments to *different* memory segments with 
>>>>>>>> different owners, right? What's the advantage of mapping the 
>>>>>>>> whole & slice to own parts by different threads?
>>>>>>>>
>>>>>>>> -Sundar
>>>>>>>>
>>>>>>>> On 23/09/20 11:02 am, Ty Young wrote:
>>>>>>>>> Javadoc isn't working for withOwnerThread or withCleanupAction 
>>>>>>>>> for me on Netbeans:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://urldefense.com/v3/__https://imgur.com/a/Q21pZAC__;!!GqivPVa7Brio!OlbtDRpO3iKUiqjySSQELP4SjIC_dCXbKMZ344YiK9XD92Rg8n6jxzYxOkrj_IvkrrLx8VA$ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Says it's downloading HTTP Javadoc for about 10 seconds then 
>>>>>>>>> gives up. Javadoc of other older methods work just fine and I 
>>>>>>>>> didn't think it'd close the segment as other 
>>>>>>>>> withers(withAccessModes) do not.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> So I wasn't able to read it, my bad.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Question: is this an absolute requirement because of technical 
>>>>>>>>> restrains? Could this be removed so that, for example, it is 
>>>>>>>>> possible to make an array segment unbound but have individual 
>>>>>>>>> array index segments be bound to a specific thread?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9/23/20 12:00 AM, sundararajan.athijegannathan at oracle.com 
>>>>>>>>> wrote:
>>>>>>>>>> javadoc of MemorySegment.withOwnerThread starts as follows:
>>>>>>>>>>
>>>>>>>>>> "    * Obtains a new memory segment backed by the same 
>>>>>>>>>> underlying memory region as this segment,
>>>>>>>>>>      * but with different owner thread. As a side-effect, 
>>>>>>>>>> this segment will be marked as <em>not alive</em>,
>>>>>>>>>>      * and subsequent operations on this segment will result 
>>>>>>>>>> in runtime errors.
>>>>>>>>>> "
>>>>>>>>>>
>>>>>>>>>> So the behavior seen is as the specification.
>>>>>>>>>>
>>>>>>>>>> -Sundar
>>>>>>>>>>
>>>>>>>>>> On 23/09/20 7:36 am, Ty Young wrote:
>>>>>>>>>>> A bug seems to have been introduced wherein using 
>>>>>>>>>>> withOwnerThread causes a MappedMemorySegment to close:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>     File file = new File("./test");
>>>>>>>>>>>
>>>>>>>>>>>         if(file.exists())
>>>>>>>>>>>             file.delete();
>>>>>>>>>>>
>>>>>>>>>>>         file.createNewFile();
>>>>>>>>>>>
>>>>>>>>>>>         MappedMemorySegment nativeFile = 
>>>>>>>>>>> MemorySegment.mapFromPath(file.toPath(), 0, 4096, 
>>>>>>>>>>> FileChannel.MapMode.READ_WRITE);
>>>>>>>>>>>
>>>>>>>>>>> System.out.println(nativeFile.isAlive());
>>>>>>>>>>>
>>>>>>>>>>>         MappedMemorySegment segment = 
>>>>>>>>>>> (MappedMemorySegment)nativeFile.asSlice(4);
>>>>>>>>>>>         segment = (MappedMemorySegment)nativeFile.asSlice(4);
>>>>>>>>>>>         segment = 
>>>>>>>>>>> (MappedMemorySegment)nativeFile.asSlice(4).withOwnerThread(null); 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> System.out.println(nativeFile.isAlive());
>>>>>>>>>>>         System.out.println(segment.isAlive());
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> which prints true, false, and then true. The original 
>>>>>>>>>>> MappedMemorySegment was never closed, so this is unexpected.
>>>>>>>>>>>


More information about the panama-dev mailing list