BUG: withOwnerThread closes MappedMemorySegment

Ty Young youngty1997 at gmail.com
Fri Sep 25 18:43:57 UTC 2020


Hi Jorn,


Tried applying the patch but got:


error: patch failed: src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp:3810
error: src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp: patch does not apply


I'm using the foreign-abi branch using the newest source on Github and 
am applying it at the root of the source(panama-foreign-foreign-abi).


There isn't really anything special about the code. The attribute code 
just wraps the NVML function calls in an "update" interface method which 
are then individually updated in a thread pool every 500 milliseconds:


     @Override
     public synchronized NVMLReturnValue update()
     {
         NVMLReturnValue returnValue = null;

         try
         {
             returnValue = 
NVMLReturnValue.getValue(nvml_h.nvmlDeviceGetCurrentClocksThrottleReasons(super.getNVDevice().get().getNativePointer(), 
this.performanceLimit)).get();
         }
         catch (Throwable ex)
         {
             ex.printStackTrace();
         }

         super.finishUpdate(returnValue, 
NVActivityState.getValue(NVMLPerformanceLimit.SOFTWARE_THERMAL_SLOWDOWN.containsMask(this.performanceLimit.longValue())));

         return returnValue;
     }


I can make a jlink distribution and upload it to Google or something. 
The Maven plugin that I use just dumps the jar files into a single 
directory because they aren't all modular, so it should be possible to 
specify that directory as the module path with a custom JDK.


On 9/25/20 10:00 AM, Jorn Vernee wrote:
> Hi Ty,
>
> I'm trying to reproduce your crash locally, but no luck so far. From 
> the error message, it seems like it might be caused by a thread being 
> suspended while inside a native call, which triggers a slow path that 
> seems to have a bug in it; not setting the thread state to 
> _thread_in_java again.
>
> The following patch fixes that mistake [1], but since I'm not able so 
> far to reproduce the crash I can't say for sure that that will fix the 
> problem you are seeing.
>
> It would help if you could provide some steps to reproduce, and/or try 
> out the patch at [1] if you can (should be enough to put it in a file 
> and then use `git apply <file>`).
>
> I'm also curious to know if you are running with the default GC, or 
> maybe another one?
>
> Thanks,
> Jorn
>
> [1] :
>
> diff --git a/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp 
> b/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
> index 51b9082cfae..887369fc5c8 100644
> --- a/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
> +++ b/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
> @@ -3810,11 +3810,12 @@ void NativeInvokerGenerator::generate() {
>    __ safepoint_poll(L_safepoint_poll_slow_path, r15_thread, rscratch1);
>    __ cmpl(Address(r15_thread, JavaThread::suspend_flags_offset()), 0);
>    __ jcc(Assembler::notEqual, L_safepoint_poll_slow_path);
> -  // change thread state
> -  __ movl(Address(r15_thread, JavaThread::thread_state_offset()), 
> _thread_in_Java);
>
>    __ bind(L_after_safepoint_poll);
>
> +  // change thread state
> +  __ movl(Address(r15_thread, JavaThread::thread_state_offset()), 
> _thread_in_Java);
> +
>    __ block_comment("reguard stack check");
>    Label L_reguard;
>    Label L_after_reguard
>
> On 25/09/2020 15:54, Maurizio Cimadamore wrote:
>>
>> On 25/09/2020 14:33, Ty Young wrote:
>>>
>>> On 9/25/20 4:58 AM, Maurizio Cimadamore wrote:
>>>> One idea - we have recently turned on the method handle intrinsics 
>>>> - try running with these parameters:
>>>>
>>>> -Djdk.internal.foreign.ProgrammableInvoker.USE_SPEC=false 
>>>> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=false
>>>>
>>>> And see if the crash still happens.
>>>
>>>
>>> Seems like that is the issue. I can't say 100% because there are 
>>> run-to-run outliers where it'll sometimes get past the 6-8 minute 
>>> mark and run for awhile, only to eventually crash. Nothing yet with 
>>> those turned off though.
>>
>> Ok, on our hand, it seems like Jorn has identified a potential 
>> culprit on the intrinsics...
>>
>> I'd suggest to run with intrinsics disabled for the time being.
>>
>> Thanks
>> Maurizio
>>
>>>
>>>
>>>>
>>>> Maurizio
>>>>
>>>> On 25/09/2020 01:46, Ty Young wrote:
>>>>>
>>>>> On 9/24/20 4:56 AM, Maurizio Cimadamore wrote:
>>>>>>
>>>>>> On 24/09/2020 04:01, Ty Young wrote:
>>>>>>>> There are two main issues with "just mutating the existining 
>>>>>>>> segments":
>>>>>>>>
>>>>>>>> 1) First and foremost, C2 likes constants. So, if the thread is 
>>>>>>>> constant on the segment we pay nothing for the ownership (or 
>>>>>>>> lack of ownership) check.
>>>>>>>>
>>>>>>>> 2) Second, you can't just "flip" a switch on a shared data 
>>>>>>>> structure, and expect the update to be seen by the rest of the 
>>>>>>>> world (in a multi-thread scenario). On the other hand, 
>>>>>>>> publishing a "new" object has more guarantees on who can do 
>>>>>>>> what with that new object (meaning it's physically not possible 
>>>>>>>> for other threads to start writing on the new segment before it 
>>>>>>>> has been returned by the API)
>>>>>>>>
>>>>>>>> Note that the original segment is killed (meaning its isAlive 
>>>>>>>> returns false) but the new segment still points at the same 
>>>>>>>> mappped memory region. So if you want a shared memory segment, 
>>>>>>>> you just have to do:
>>>>>>>>
>>>>>>>> MappedMemorySegment nativeFile = 
>>>>>>>> MemorySegment.mapFromPath(file.toPath(), 0, 4096, 
>>>>>>>> FileChannel.MapMode.READ_WRITE)
>>>>>>>> .withOwnerThread(null);
>>>>>>>
>>>>>>>
>>>>>>> Yeah, figured that'd work. I had to add an if-else to check if a 
>>>>>>> segment is already unbound, which got me past that error, 
>>>>>>> although my projects still aren't running because of a 
>>>>>>> StackOverflow and "java.lang.NoClassDefFoundError: Could not 
>>>>>>> initialize class java.lang.StackTraceElement$HashedModules" 
>>>>>>> error related to MethodHandles that are being added to struct 
>>>>>>> layouts. Not 100% sure what's going on there...
>>>>>>
>>>>>> That looks odd - if you could send the entire stack trace we 
>>>>>> could perhaps quickly glance if it's coming from our API, or your 
>>>>>> code.
>>>>>
>>>>>
>>>>> Yeah, it was my code, sorry. My application at least runs now... 
>>>>> for maybe about 6-8 minutes... until it crashes.
>>>>>
>>>>>
>>>>> Multiple crashes look like this:
>>>>>
>>>>>
>>>>> Current thread (0x00007f17e41a4f30):  JavaThread "pool-2-thread-2" 
>>>>> [_thread_in_native_trans, id=2961304, 
>>>>> stack(0x00007f17d09a5000,0x00007f17d0aa6000)]
>>>>>
>>>>> Stack: [0x00007f17d09a5000,0x00007f17d0aa6000], 
>>>>> sp=0x00007f17d0aa47b8,  free space=1021k
>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, 
>>>>> j=interpreted, Vv=VM code, C=native code)
>>>>> J 9169 c2 
>>>>> java.lang.invoke.LambdaForm$MH+0x0000000800eb2040.invoke(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)I 
>>>>> java.base at 16-internal (89 bytes) @ 0x00007f185c85867a 
>>>>> [0x00007f185c858620+0x000000000000005a]
>>>>>
>>>>> [error occurred during error reporting (printing native stack), id 
>>>>> 0xb, SIGSEGV (0xb) at pc=0x00007f1871cc7075]
>>>>>
>>>>>
>>>>> Although a small percentage look like:
>>>>>
>>>>>
>>>>> Current thread (0x00007f300c0c1220):  JavaThread "pool-2-thread-9" 
>>>>> [_thread_in_native_trans, id=3014594, 
>>>>> stack(0x00007f3001f48000,0x00007f3002049000)]
>>>>>
>>>>> Stack: [0x00007f3001f48000,0x00007f3002049000], 
>>>>> sp=0x00007f3002047798,  free space=1021k
>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, 
>>>>> j=interpreted, Vv=VM code, C=native code)
>>>>> J 2028 c2 
>>>>> jdk.incubator.foreign.ValueLayout.attribute(Ljava/lang/String;)Ljava/util/Optional; 
>>>>> jdk.incubator.foreign at 16-internal (6 bytes) @ 0x00007f30984f1872 
>>>>> [0x00007f30984f1400+0x0000000000000472]
>>>>>
>>>>> [error occurred during error reporting (printing native stack), id 
>>>>> 0xb, SIGSEGV (0xb) at pc=0x00007f30afa54075]
>>>>>
>>>>>
>>>>> Errors always seem to be in relation to compilation events:
>>>>>
>>>>>
>>>>> Compiled method (c1)  357892 5293  s!   3 
>>>>> org.goliath.envious.nvml.local.attributes.performance.limiters.NVMLGPUPerformanceLimitSoftwareThermalSlowdownAttribute::update 
>>>>> (63 bytes)
>>>>>
>>>>>
>>>>> Is there any JVM or panama debug arguments I can use to find the 
>>>>> issue?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Cheers
>>>>>> Maurizio
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> E.g. use chaining to get to the segment you want to construct.
>>>>>>>>
>>>>>>>> As for the interaction with asSlice() - a slice is not a true 
>>>>>>>> standalone segment - is just another view of the existing 
>>>>>>>> segment with different bounds - in fact closing the slice also 
>>>>>>>> closes the parent segments. So, calling withOwnerThread will 
>>>>>>>> also kill all associated slices (or, calling withCleanupAction 
>>>>>>>> will attach the cleanup action to all the segments which share 
>>>>>>>> the same temporal bound).
>>>>>>>>
>>>>>>>> There is, currently, no way to create a segment, and then slice 
>>>>>>>> it so that you can N segments each owned by a different thread, 
>>>>>>>> if that's what you are looking for. We looked into that and 
>>>>>>>> it's way too messy (not only you have to worry about how to 
>>>>>>>> "split" the segment, but also about how you merge it back - and 
>>>>>>>> that's the hard part). So, if you are not ok with confinement, 
>>>>>>>> create a shared segment and work with it (e.g. like it was a 
>>>>>>>> ByteBuffer or a Java array). It is up to you then to make sure 
>>>>>>>> that multiple threads operate on the segment in a way that 
>>>>>>>> makes sense (either by assigning disjoint slices to different 
>>>>>>>> threads, e.g. using a spliterator, which the API supports), or 
>>>>>>>> by using some synchronization.
>>>>>>>
>>>>>>>
>>>>>>> Figured I'd have to do manual synchronization.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>>
>>>>>>>> On 23/09/2020 07:47, sundararajan.athijegannathan at oracle.com 
>>>>>>>> wrote:
>>>>>>>>> Can you access source?
>>>>>>>>>
>>>>>>>>> https://urldefense.com/v3/__https://github.com/openjdk/panama-foreign/blob/foreign-jextract/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/MemorySegment.java__;!!GqivPVa7Brio!OlbtDRpO3iKUiqjySSQELP4SjIC_dCXbKMZ344YiK9XD92Rg8n6jxzYxOkrj_IvkZ8Tlf78$ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, afaik it is implementation technical reason. (Maurizio 
>>>>>>>>> will clarify). That said, it should be possible to map 
>>>>>>>>> different file segments to *different* memory segments with 
>>>>>>>>> different owners, right? What's the advantage of mapping the 
>>>>>>>>> whole & slice to own parts by different threads?
>>>>>>>>>
>>>>>>>>> -Sundar
>>>>>>>>>
>>>>>>>>> On 23/09/20 11:02 am, Ty Young wrote:
>>>>>>>>>> Javadoc isn't working for withOwnerThread or 
>>>>>>>>>> withCleanupAction for me on Netbeans:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> https://urldefense.com/v3/__https://imgur.com/a/Q21pZAC__;!!GqivPVa7Brio!OlbtDRpO3iKUiqjySSQELP4SjIC_dCXbKMZ344YiK9XD92Rg8n6jxzYxOkrj_IvkrrLx8VA$ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Says it's downloading HTTP Javadoc for about 10 seconds then 
>>>>>>>>>> gives up. Javadoc of other older methods work just fine and I 
>>>>>>>>>> didn't think it'd close the segment as other 
>>>>>>>>>> withers(withAccessModes) do not.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So I wasn't able to read it, my bad.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Question: is this an absolute requirement because of 
>>>>>>>>>> technical restrains? Could this be removed so that, for 
>>>>>>>>>> example, it is possible to make an array segment unbound but 
>>>>>>>>>> have individual array index segments be bound to a specific 
>>>>>>>>>> thread?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9/23/20 12:00 AM, sundararajan.athijegannathan at oracle.com 
>>>>>>>>>> wrote:
>>>>>>>>>>> javadoc of MemorySegment.withOwnerThread starts as follows:
>>>>>>>>>>>
>>>>>>>>>>> "    * Obtains a new memory segment backed by the same 
>>>>>>>>>>> underlying memory region as this segment,
>>>>>>>>>>>      * but with different owner thread. As a side-effect, 
>>>>>>>>>>> this segment will be marked as <em>not alive</em>,
>>>>>>>>>>>      * and subsequent operations on this segment will result 
>>>>>>>>>>> in runtime errors.
>>>>>>>>>>> "
>>>>>>>>>>>
>>>>>>>>>>> So the behavior seen is as the specification.
>>>>>>>>>>>
>>>>>>>>>>> -Sundar
>>>>>>>>>>>
>>>>>>>>>>> On 23/09/20 7:36 am, Ty Young wrote:
>>>>>>>>>>>> A bug seems to have been introduced wherein using 
>>>>>>>>>>>> withOwnerThread causes a MappedMemorySegment to close:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>     File file = new File("./test");
>>>>>>>>>>>>
>>>>>>>>>>>>         if(file.exists())
>>>>>>>>>>>>             file.delete();
>>>>>>>>>>>>
>>>>>>>>>>>>         file.createNewFile();
>>>>>>>>>>>>
>>>>>>>>>>>>         MappedMemorySegment nativeFile = 
>>>>>>>>>>>> MemorySegment.mapFromPath(file.toPath(), 0, 4096, 
>>>>>>>>>>>> FileChannel.MapMode.READ_WRITE);
>>>>>>>>>>>>
>>>>>>>>>>>> System.out.println(nativeFile.isAlive());
>>>>>>>>>>>>
>>>>>>>>>>>>         MappedMemorySegment segment = 
>>>>>>>>>>>> (MappedMemorySegment)nativeFile.asSlice(4);
>>>>>>>>>>>>         segment = (MappedMemorySegment)nativeFile.asSlice(4);
>>>>>>>>>>>>         segment = 
>>>>>>>>>>>> (MappedMemorySegment)nativeFile.asSlice(4).withOwnerThread(null); 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> System.out.println(nativeFile.isAlive());
>>>>>>>>>>>> System.out.println(segment.isAlive());
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> which prints true, false, and then true. The original 
>>>>>>>>>>>> MappedMemorySegment was never closed, so this is unexpected.
>>>>>>>>>>>>


More information about the panama-dev mailing list