RFR: JDK-8210337: runtime/NMT/VirtualAllocTestType.java failed on RuntimeException missing from stdout/stderr

David Holmes david.holmes at oracle.com
Tue Oct 2 22:23:15 UTC 2018


Minor correction: EPERM -> EACCES for Solaris

Hard to see how to get a transient EACCES when opening a file ... though 
as it is really a door I guess there could be additional complexity.
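
For illustration only, here is a rough sketch of the two ideas in this thread: retry the open after a short pause, and include the path in the error text. It is not the code in the webrev. The helper names (open_with_retry, format_open_error), the retry count and the delay are all made up for the example, and treating both ENOENT and EACCES as retryable is an assumption about what "transient" means here. The JNI exception throwing is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: write "<path>: <strerror(err)>" into buf. */
static void format_open_error(char *buf, size_t len, const char *path, int err) {
    assert(buf != NULL && path != NULL);
    snprintf(buf, len, "%s: %s", path, strerror(err));
}

/*
 * Hypothetical helper: open path with O_RDWR, retrying while the failure
 * is ENOENT or EACCES (assumed to be the transient cases), for at most
 * max_tries attempts with delay_ms milliseconds between attempts.
 * Returns the descriptor, or -1 with *err_out set to the last errno.
 */
static int open_with_retry(const char *path, int max_tries, long delay_ms,
                           int *err_out) {
    struct timespec delay = {
        .tv_sec = delay_ms / 1000,
        .tv_nsec = (delay_ms % 1000) * 1000000L
    };
    int err = 0;

    assert(path != NULL && err_out != NULL);
    for (int i = 0; i < max_tries; i++) {
        int fd = open(path, O_RDWR);
        if (fd != -1) {
            *err_out = 0;
            return fd;
        }
        err = errno;
        if (err != ENOENT && err != EACCES) {
            break;                      /* not considered transient: give up */
        }
        if (i + 1 < max_tries) {
            nanosleep(&delay, NULL);    /* short pause before the next try */
        }
    }
    *err_out = err;
    return -1;
}
```

A caller would try open_with_retry(path, 2, 100, &err) and, on -1, pass err and the path to format_open_error to build the text for the IOException.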

David

On 3/10/2018 7:54 AM, Chris Plummer wrote:
> On 10/2/18 2:38 PM, David Holmes wrote:
>> Chris,
>>
>> On 3/10/2018 6:57 AM, Chris Plummer wrote:
>>>
>>>
>>> On 10/2/18 1:44 PM, gary.adams at oracle.com wrote:
>>>> The general attach sequence ...
>>>>
>>>> src/jdk.attach/solaris/classes/sun/tools/attach/VirtualMachineImpl.java
>>>>
>>>>  the attacher creates an attach_pid file in a directory where the 
>>>> attachee is running
>>>>  issues a signal to the attachee
>>>>
>>>>   loops waiting for the java_pid file to be created
>>>>   default timeout is 10 seconds
>>>>
>>> So getting a FileNotFoundException while in this loop is OK, but 
>>> IOException is not.
>>>
>>>> src/hotspot/os/solaris/attachListener_solaris.cpp
>>>>
>>>>    attachee creates the java_pid file
>>>>    listens until the attacher opens the door
>>>>
>>> I don't think this is related, but JDK-8199811 made a fix in 
>>> attachListener_solaris.cpp to make it wait up to 10 seconds for 
>>> initialization to complete before failing the enqueue.
>>>
>>>> ...
>>>> Not sure when a bare IOException is thrown rather than the
>>>> more specific FileNotFoundException.
>>> Where is the IOException originating from? I wonder if the issue is 
>>> that the file is in the process of being created, but is not fully 
>>> created yet. Maybe it is there, but owner/group/permissions have not 
>>> been set yet, and this results in an IOException instead of 
>>> FileNotFoundException.
>>
>> The exception is shown in the bug report:
>>
>>  [java.io.IOException: Permission denied
>> at jdk.attach/sun.tools.attach.VirtualMachineImpl.open(Native Method)
>> at jdk.attach/sun.tools.attach.VirtualMachineImpl.openDoor(VirtualMachineImpl.java:215)
>> at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:71)
>> at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
>> at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:114)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:98)
>>
>> And if you look at the native code the EPERM from open will cause 
>> IOException to be thrown.
>>
>> ./jdk.attach/solaris/native/libattach/VirtualMachineImpl.c
>>
>> JNIEXPORT jint JNICALL Java_sun_tools_attach_VirtualMachineImpl_open
>>   (JNIEnv *env, jclass cls, jstring path)
>> {
>>     jboolean isCopy;
>>     const char* p = GetStringPlatformChars(env, path, &isCopy);
>>     if (p == NULL) {
>>         return 0;
>>     } else {
>>         int fd;
>>         int err = 0;
>>
>>         fd = open(p, O_RDWR);
>>         if (fd == -1) {
>>             err = errno;
>>         }
>>
>>         if (isCopy) {
>>             JNU_ReleaseStringPlatformChars(env, path, p);
>>         }
>>
>>         if (fd == -1) {
>>             if (err == ENOENT) {
>>                 JNU_ThrowByName(env, "java/io/FileNotFoundException", NULL);
>>             } else {
>>                 char* msg = strdup(strerror(err));
>>                 JNU_ThrowIOException(env, msg);
>>                 if (msg != NULL) {
>>                     free(msg);
>>                 }
>>
>>
>> We should add the path to the exception message.
>>
> Thanks David. So if EPERM is the error and a retry 100ms later works, I 
> think that supports my hypothesis that the file is not quite fully 
> created. So Gary's fix is probably fine. The only other possible fix I 
> can think of that wouldn't require an explicit delay (or multiple 
> retries) is probably not worth the complexity. It would require that the 
> attachee create two files, and the attacher try to open the second file 
> first. When it either opens or returns EPERM, you know the first file 
> can safely be opened.
> 
> Chris
>> David
>> -----
>>
>>> Chris
>>>>
>>>>
>>>>
>>>> On 10/2/18 4:11 PM, Chris Plummer wrote:
>>>>> Can you summarize how the attach handshaking is supposed to work? 
>>>>> I'm just wondering why the attacher would ever be looking for the 
>>>>> file before the attachee has created it. It seems a proper 
>>>>> handshake would prevent this. Maybe there's some sort of visibility 
>>>>> issue where the attachee has indeed created the file, but it is not 
>>>>> immediately visible to the attacher process.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 10/2/18 12:27 PM, gary.adams at oracle.com wrote:
>>>>>> The problem reproduced pretty quickly.
>>>>>> I added a call to checkPermission, which revealed a
>>>>>> "file not found" from the stat call when the IOException
>>>>>> was detected.
>>>>>>
>>>>>> There has been some flakiness from the Solaris test machines today,
>>>>>> so I'll continue with the testing a bit longer.
>>>>>>
>>>>>> On 10/2/18 3:12 PM, Chris Plummer wrote:
>>>>>>> Without the fix was this issue easy enough to reproduce that you 
>>>>>>> can be sure this is resolving it?
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 10/2/18 8:16 AM, Gary Adams wrote:
>>>>>>>> Solaris debug builds are failing tests that use the attach 
>>>>>>>> interface.
>>>>>>>> An IOException is reported when the java_pid file is not opened.
>>>>>>>>
>>>>>>>> It appears that the attempt to attach is taking place too quickly.
>>>>>>>> This workaround will allow the open operation to be retried
>>>>>>>> after a short pause.
>>>>>>>>
>>>>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8210337/webrev/
>>>>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8210337
>>>>>>>>
>>>>>>>> Testing is in progress.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
> 
> 

