RFR: JDK-8210337: runtime/NMT/VirtualAllocTestType.java failed on RuntimeException missing from stdout/stderr
David Holmes
david.holmes at oracle.com
Tue Oct 2 22:23:15 UTC 2018
Minor correction: EPERM -> EACCES for Solaris
Hard to see how to get a transient EACCES when opening a file ... though
as it is really a door I guess there could be additional complexity.
David
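
For readers following the thread, the "retry after a short pause" workaround discussed in the quoted messages below can be sketched roughly as follows. This is a minimal illustration and not the code from the webrev: the class RetryingOpen, the Opener interface, and the retry count and pause parameters are all hypothetical names invented here. It assumes FileNotFoundException should go straight back to the caller's existing polling loop, while any other IOException (such as "Permission denied") is treated as possibly transient. The final message includes the path, as suggested in the thread.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Hypothetical helper for illustration only; not the code from the webrev. */
final class RetryingOpen {

    /** Stand-in for the native open(path) call (assumed shape). */
    interface Opener {
        int open(String path) throws IOException;
    }

    /**
     * Calls opener.open(path), retrying up to maxRetries times when a
     * non-FileNotFoundException IOException is thrown.
     */
    static int openWithRetry(Opener opener, String path, int maxRetries, long pauseMillis)
            throws IOException, InterruptedException {
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return opener.open(path);
            } catch (FileNotFoundException e) {
                // File not created yet: leave this to the caller's polling loop.
                throw e;
            } catch (IOException e) {
                // Possibly transient (e.g. "Permission denied"): pause and retry.
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        // Retries exhausted: report the path along with the original message.
        throw new IOException("open failed for " + path + ": " + last.getMessage(), last);
    }
}
```

A caller would pass the real open routine as the Opener, for example a method reference, together with a small retry budget. Whether a bounded retry is the right policy depends on what actually causes the transient error, which the thread leaves open.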
On 3/10/2018 7:54 AM, Chris Plummer wrote:
> On 10/2/18 2:38 PM, David Holmes wrote:
>> Chris,
>>
>> On 3/10/2018 6:57 AM, Chris Plummer wrote:
>>>
>>>
>>> On 10/2/18 1:44 PM, gary.adams at oracle.com wrote:
>>>> The general attach sequence ...
>>>>
>>>> src/jdk.attach/solaris/classes/sun/tools/attach/VirtualMachineImpl.java
>>>>
>>>> the attacher creates an attach_pid file in a directory where the
>>>> attachee is running
>>>> issues a signal to the attachee
>>>>
>>>> loops waiting for the java_pid file to be created
>>>> default timeout is 10 seconds
>>>>
>>> So getting a FileNotFoundException while in this loop is OK, but
>>> IOException is not.
>>>
>>>> src/hotspot/os/solaris/attachListener_solaris.cpp
>>>>
>>>> attachee creates the java_pid file
>>>> listens until the attacher opens the door
>>>>
>>> I don't think this is related, but JDK-8199811 made a fix in
>>> attachListener_solaris.cpp to make it wait up to 10 seconds for
>>> initialization to complete before failing the enqueue.
>>>
>>>> ...
>>>> Not sure when a bare IOException is thrown rather than the
>>>> more specific FileNotFoundException.
>>> Where is the IOException originating from? I wonder if the issue is
>>> that the file is in the process of being created, but is not fully
>>> created yet. Maybe it is there, but owner/group/permissions have not
>>> been set yet, and this results in an IOException instead of
>>> FileNotFoundException.
>>
>> The exception is shown in the bug report:
>>
>> [java.io.IOException: Permission denied
>>     at jdk.attach/sun.tools.attach.VirtualMachineImpl.open(Native Method)
>>     at jdk.attach/sun.tools.attach.VirtualMachineImpl.openDoor(VirtualMachineImpl.java:215)
>>     at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:71)
>>     at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
>>     at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
>>     at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:114)
>>     at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:98)
>>
>> And if you look at the native code the EPERM from open will cause
>> IOException to be thrown.
>>
>> ./jdk.attach/solaris/native/libattach/VirtualMachineImpl.c
>>
>> JNIEXPORT jint JNICALL Java_sun_tools_attach_VirtualMachineImpl_open
>>   (JNIEnv *env, jclass cls, jstring path)
>> {
>>     jboolean isCopy;
>>     const char* p = GetStringPlatformChars(env, path, &isCopy);
>>     if (p == NULL) {
>>         return 0;
>>     } else {
>>         int fd;
>>         int err = 0;
>>
>>         fd = open(p, O_RDWR);
>>         if (fd == -1) {
>>             err = errno;
>>         }
>>
>>         if (isCopy) {
>>             JNU_ReleaseStringPlatformChars(env, path, p);
>>         }
>>
>>         if (fd == -1) {
>>             if (err == ENOENT) {
>>                 JNU_ThrowByName(env, "java/io/FileNotFoundException", NULL);
>>             } else {
>>                 char* msg = strdup(strerror(err));
>>                 JNU_ThrowIOException(env, msg);
>>                 if (msg != NULL) {
>>                     free(msg);
>>                 }
>>
>>
>> We should add the path to the exception message.
>>
> Thanks David. So if EPERM is the error and a retry 100ms later works, I
> think that supports my hypothesis that the file is not quite fully
> created. So Gary's fix is probably fine. The only other possible fix I
> can think of that wouldn't require an explicit delay (or multiple
> retries) is probably not worth the complexity. It would require that the
> attachee create two files, and the attacher try to open the second file
> first. When it either opens or returns EPERM, you know the first file
> can safely be opened.
>
> Chris
>> David
>> -----
>>
>>> Chris
>>>>
>>>>
>>>>
>>>> On 10/2/18 4:11 PM, Chris Plummer wrote:
>>>>> Can you summarize how the attach handshaking is supposed to work?
>>>>> I'm just wondering why the attacher would ever be looking for the
>>>>> file before the attachee has created it. It seems a proper
>>>>> handshake would prevent this. Maybe there's some sort of visibility
>>>>> issue where the attachee has indeed created the file, but it is not
>>>>> immediately visible to the attacher process.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 10/2/18 12:27 PM, gary.adams at oracle.com wrote:
>>>>>> The problem reproduced pretty quickly.
>>>>>> I added a call to checkPermission, which revealed the
>>>>>> "file not found" from the stat call when the IOException
>>>>>> was detected.
>>>>>>
>>>>>> There has been some flakiness from the Solaris test machines today,
>>>>>> so I'll continue with the testing a bit longer.
>>>>>>
>>>>>> On 10/2/18 3:12 PM, Chris Plummer wrote:
>>>>>>> Without the fix was this issue easy enough to reproduce that you
>>>>>>> can be sure this is resolving it?
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 10/2/18 8:16 AM, Gary Adams wrote:
>>>>>>>> Solaris debug builds are failing tests that use the attach
>>>>>>>> interface.
>>>>>>>> An IOException is reported when the java_pid file is not opened.
>>>>>>>>
>>>>>>>> It appears that the attempt to attach is taking place too quickly.
>>>>>>>> This workaround will allow the open operation to be retried
>>>>>>>> after a short pause.
>>>>>>>>
>>>>>>>> Webrev: http://cr.openjdk.java.net/~gadams/8210337/webrev/
>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8210337
>>>>>>>>
>>>>>>>> Testing is in progress.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>
>
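
The two-file idea floated in the quoted discussion above (the attachee creates two files, and the attacher probes the second before opening the first) might look roughly like the sketch below. It is only an illustration under stated assumptions: the class TwoFileHandshake, its method names, and the use of ordinary files through java.nio.file are all hypothetical, whereas the real Solaris mechanism uses a door and native code. The attacher-side check uses Files.notExists so that "exists" and "could not be determined" (for example, permission denied) both count as ready, mirroring the "either opens or returns EPERM" condition.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a two-file handshake; not part of the JDK. */
final class TwoFileHandshake {

    /**
     * Attachee side: create the primary file first, and only then create
     * the marker file, so the marker's presence implies the primary file
     * has already been created.
     */
    static void publish(Path primary, Path marker) throws IOException {
        Files.createFile(primary);
        Files.createFile(marker);
    }

    /**
     * Attacher side: the primary file is considered safe to open unless
     * the marker is confirmed to be absent.
     */
    static boolean readyToOpen(Path marker) {
        return !Files.notExists(marker);
    }
}
```

As noted in the thread, this adds a second file to manage and clean up, which is part of why it was judged probably not worth the complexity compared with a short retry.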