RFR: JDK-8210337: runtime/NMT/VirtualAllocTestType.java failed on RuntimeException missing from stdout/stderr

Chris Plummer chris.plummer at oracle.com
Tue Oct 2 21:54:07 UTC 2018


On 10/2/18 2:38 PM, David Holmes wrote:
> Chris,
>
> On 3/10/2018 6:57 AM, Chris Plummer wrote:
>>
>>
>> On 10/2/18 1:44 PM, gary.adams at oracle.com wrote:
>>> The general attach sequence ...
>>>
>>> src/jdk.attach/solaris/classes/sun/tools/attach/VirtualMachineImpl.java
>>>
>>>  the attacher creates an attach_pid file in a directory where the 
>>> attachee is running
>>>  issues a signal to the attachee
>>>
>>>   loops waiting for the java_pid file to be created
>>>   default timeout is 10 seconds
>>>
>> So getting a FileNotFoundException while in this loop is OK, but 
>> IOException is not.
>>
>>> src/hotspot/os/solaris/attachListener_solaris.cpp
>>>
>>>    attachee creates the java_pid file
>>>    listens until the attacher opens the door
>>>
>> I don't think this is related, but JDK-8199811 made a fix in 
>> attachListener_solaris.cpp to make it wait up to 10 seconds for 
>> initialization to complete before failing the enqueue.
>>
>>> ...
>>> Not sure when a bare IOException is thrown rather than the
>>> more specific FileNotFoundException.
>> Where is the IOException originating from? I wonder if the issue is 
>> that the file is in the process of being created, but is not fully 
>> created yet. Maybe it is there, but owner/group/permissions have not 
>> been set yet, and this results in an IOException instead of 
>> FileNotFoundException.
>
> The exception is shown in the bug report:
>
>  [java.io.IOException: Permission denied
> at jdk.attach/sun.tools.attach.VirtualMachineImpl.open(Native Method)
> at jdk.attach/sun.tools.attach.VirtualMachineImpl.openDoor(VirtualMachineImpl.java:215)
> at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:71)
> at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
> at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
> at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:114)
> at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:98)
>
> And if you look at the native code the EPERM from open will cause 
> IOException to be thrown.
>
> ./jdk.attach/solaris/native/libattach/VirtualMachineImpl.c
>
> JNIEXPORT jint JNICALL Java_sun_tools_attach_VirtualMachineImpl_open
>   (JNIEnv *env, jclass cls, jstring path)
> {
>     jboolean isCopy;
>     const char* p = GetStringPlatformChars(env, path, &isCopy);
>     if (p == NULL) {
>         return 0;
>     } else {
>         int fd;
>         int err = 0;
>
>         fd = open(p, O_RDWR);
>         if (fd == -1) {
>             err = errno;
>         }
>
>         if (isCopy) {
>             JNU_ReleaseStringPlatformChars(env, path, p);
>         }
>
>         if (fd == -1) {
>             if (err == ENOENT) {
>                 JNU_ThrowByName(env, "java/io/FileNotFoundException", NULL);
>             } else {
>                 char* msg = strdup(strerror(err));
>                 JNU_ThrowIOException(env, msg);
>                 if (msg != NULL) {
>                     free(msg);
>                 }
>
>
> We should add the path to the exception message.
>
Thanks David. So if EPERM is the error and a retry 100ms later works, I 
think that supports my hypothesis that the file is not quite fully 
created. So Gary's fix is probably fine. The only other possible fix I 
can think of that wouldn't require an explicit delay (or multiple 
retries) is probably not worth the complexity. It would require that the 
attachee create two files, and the attacher try to open the second file 
first. When it either opens or returns EPERM, you know the first file 
can safely be opened.
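
To make the retry idea concrete, here is a minimal Java sketch. It is not the code from Gary's webrev, which I am not reproducing here. The class name AttachRetrySketch, the Opener interface (a stand-in for the native open call), and the retry count and delay parameters are all hypothetical and chosen only for illustration. It lets FileNotFoundException propagate to the existing wait loop, retries any other IOException after a short pause, and adds the path to the final exception message as David suggested.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class AttachRetrySketch {

    /** Hypothetical stand-in for the native open(); returns a file descriptor. */
    @FunctionalInterface
    public interface Opener {
        int open(String path) throws IOException;
    }

    /**
     * Tries to open the path, retrying on a bare IOException (for example
     * "Permission denied" while the file is still being set up).
     * FileNotFoundException is rethrown at once so the caller's existing
     * wait-for-file loop can handle it.
     */
    public static int openWithRetry(Opener opener, String path,
                                    int maxRetries, long delayMillis)
            throws IOException, InterruptedException {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return opener.open(path);
            } catch (FileNotFoundException e) {
                throw e; // file not created yet: not a retry case here
            } catch (IOException e) {
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(delayMillis); // short pause before retrying
                }
            }
        }
        // Include the path in the message so failures are easier to diagnose.
        throw new IOException(path + ": " + last.getMessage(), last);
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated opener (assumption): fails twice, then succeeds.
        Opener flaky = p -> {
            if (calls[0]++ < 2) {
                throw new IOException("Permission denied");
            }
            return 7;
        };
        // The path below is a made-up example, not a real door file.
        int fd = openWithRetry(flaky, "/tmp/.java_pid1234", 3, 100L);
        System.out.println("fd=" + fd + " after " + calls[0] + " attempts");
    }
}
```

The two-file alternative would avoid the sleep, but it needs changes on both the attacher and attachee sides, which is why a bounded retry looks like the simpler option.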

Chris
> David
> -----
>
>> Chris
>>>
>>>
>>>
>>> On 10/2/18 4:11 PM, Chris Plummer wrote:
>>>> Can you summarize how the attach handshaking is supposed to work? 
>>>> I'm just wondering why the attacher would ever be looking for the 
>>>> file before the attachee has created it. It seems a proper 
>>>> handshake would prevent this. Maybe there's some sort of visibility 
>>>> issue where the attachee has indeed created the file, but it is not 
>>>> immediately visible to the attacher process.
>>>>
>>>> Chris
>>>>
>>>> On 10/2/18 12:27 PM, gary.adams at oracle.com wrote:
>>>>> The problem reproduced pretty quickly.
>>>>> I added a call to checkPermission and revealed the
>>>>> "file not found" from the stat call when the IOException
>>>>> was detected.
>>>>>
>>>>> There has been some flakiness from the Solaris test machines today,
>>>>> so I'll continue with the testing a bit longer.
>>>>>
>>>>> On 10/2/18 3:12 PM, Chris Plummer wrote:
>>>>>> Without the fix was this issue easy enough to reproduce that you 
>>>>>> can be sure this is resolving it?
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 10/2/18 8:16 AM, Gary Adams wrote:
>>>>>>> Solaris debug builds are failing tests that use the attach 
>>>>>>> interface.
>>>>>>> An IOException is reported when the java_pid file is not opened.
>>>>>>>
>>>>>>> It appears that the attempt to attach is taking place too quickly.
>>>>>>> This workaround will allow the open operation to be retried
>>>>>>> after a short pause.
>>>>>>>
>>>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8210337/webrev/
>>>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8210337
>>>>>>>
>>>>>>> Testing is in progress.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>



