RFR 7162400: Intermittent java.io.IOException: Bad file number during HotSpotVirtualMachine.executeCommand
Mikael Gerdin
mikael.gerdin at oracle.com
Tue Jul 9 05:48:34 PDT 2013
Peter,
On 2013-07-09 14:25, Peter Allwin wrote:
> Hello!
>
> It is reproducible by letting the test create .java_pid* files for all
> possible process IDs on the system, setting the correct access flags,
> launching the target VM and attempting to connect. There are some
> caveats, but it should be doable.
>
> I’ll convert the repro script to JTREG and add it to the webrev.
It's probably not a good idea to have a test which taints the system
with stale .java_pid* files.
If the test execution times out and the script isn't allowed to clean up,
I imagine that subsequent executions could fail.
Is there a way to tell the Attach API to use a specific directory so you
won't need to taint /tmp?
/Mikael
>
> Thanks for the reviews!
>
> /peter
>
> *From:* serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
> *Sent:* Tuesday, July 9, 2013 1:26 AM
> *To:* daniel.daugherty at oracle.com
> *Cc:* Peter Allwin; serviceability-dev at openjdk.java.net;
> hotspot-runtime-dev at openjdk.java.net
> *Subject:* Re: RFR 7162400: Intermittent java.io.IOException: Bad file
> number during HotSpotVirtualMachine.executeCommand
>
> Ok, thanks!
>
> Peter, did you manage to reproduce this issue with your script?
> If so, then, please, include it into the bug report and remove the
> "noreg-sqe" label.
>
> It is Ok if you did not reproduce it, though.
>
> Thanks,
> Serguei
>
>
> On 7/8/13 4:20 PM, Daniel D. Daugherty wrote:
>
> I definitely don't insist... :-)
>
> BTW, I noticed this in Peter's e-mail:
>
> > Testing:
> > JPRT, reproducing script on Solaris, Linux.
>
> so maybe Peter already has this covered with "reproducing script"...
>
> Dan
>
> On 7/8/13 5:07 PM, serguei.spitsyn at oracle.com wrote:
>
> Dan,
>
> Thank you for the recommendation.
> But I'm still not sure it is the right thing to do.
> Even though there are multiple test cases associated with this bug,
> they cannot be used to verify the fix because an additional condition
> must be present as well.
> This condition is the presence of a stale door file, which is not
> that easy to reproduce.
>
> However, if you insist, I can change the label to "noreg-sqe"
> with a corresponding comment.
>
> Thanks,
> Serguei
>
>
> On 7/8/13 3:46 PM, Daniel D. Daugherty wrote:
>
> Serguei,
>
> There are a number of existing tests associated with this bug. I don't
> think that 'noreg-hard' is the right label. I think 'noreg-sqe' is
> the right one:
>
> noreg-sqe
> Change can be verified by running an existing SQE test suite; the bug
> should identify the suite and the specific test case(s).
>
> Dan
>
> On 7/8/13 12:59 PM, serguei.spitsyn at oracle.com wrote:
>
> Peter,
>
> I've added the label "noreg-hard" to the report, with a comment.
> It is not easy to reproduce the issue and demonstrate
> the fix in a regression test.
>
> Thanks,
> Serguei
>
>
> On 7/8/13 11:36 AM, serguei.spitsyn at oracle.com wrote:
>
> Hi Peter,
>
> The fix looks good.
>
> Thanks,
> Serguei
>
> On 7/8/13 6:54 AM, Peter Allwin wrote:
>
> Hello!
>
> Looking for reviews of this change:
>
> http://cr.openjdk.java.net/~allwin/7162400/webrev.01/
>
> For CR:
>
> http://bugs.sun.com/view_bug.do?bug_id=7162400
>
> https://jbs.oracle.com/bugs/browse/JDK-7162400
>
> Summary:
>
> This change addresses an issue in the Attach API
> on Solaris, Linux and BSD where an attaching
> application can receive IOExceptions such as
> “Bad file number” (Solaris), “Connection
> refused” (Linux/BSD), or “well-known file is not
> secure”.
>
> The attach process uses a file in the temporary
> directory as a door (Solaris) or domain socket
> (Linux, BSD) to communicate with the VM. In
> certain circumstances stale files can be left in
> the file system which can cause the attaching
> application to believe that the VM is ready to
> receive a connection when it’s not. With this
> change the stale file will be removed during VM
> startup.
>
> Note that there is still an issue if we don’t
> have permission to remove the stale file: in that case the
> attaching process will still fail to connect.
>
> Testing:
>
> JPRT, reproducing script on Solaris, Linux.
>
> Credits:
>
> Thanks to Staffan Larsen who worked on this
> issue with me.
>
> Regards,
>
>
> Peter
>
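For readers following the thread, the cleanup described in the summary above can be illustrated roughly as follows. This is only a sketch in Java, not the code in the webrev: the actual fix lives in the VM's startup path, and the class name, the helper names and the pid value here are made up for illustration. It assumes the well-known file is named .java_pid<pid>, as the messages above suggest, and it deliberately works in a scratch directory rather than /tmp, in line with Mikael's concern about tainting the system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class StaleAttachFileSketch {

    // Hypothetical helper: path of the well-known attach file for a pid,
    // assuming the ".java_pid<pid>" naming mentioned in the thread.
    static Path attachFile(Path tmpDir, long pid) {
        return tmpDir.resolve(".java_pid" + pid);
    }

    // Hypothetical helper: remove a leftover attach file, if there is one.
    // Returns true if a file was removed, false if none existed.
    // An IOException propagates if the file cannot be deleted, which
    // corresponds to the remaining "no permission" case in the summary.
    static boolean removeStale(Path tmpDir, long pid) throws IOException {
        return Files.deleteIfExists(attachFile(tmpDir, pid));
    }

    public static void main(String[] args) throws IOException {
        // Use a private scratch directory so nothing is left behind in /tmp.
        Path scratch = Files.createTempDirectory("attach-sketch");
        try {
            long pid = 12345; // arbitrary pid, for illustration only
            Files.createFile(attachFile(scratch, pid)); // simulate a stale file
            System.out.println("removed stale file: " + removeStale(scratch, pid));
            System.out.println("removed again: " + removeStale(scratch, pid));
        } finally {
            Files.deleteIfExists(scratch); // scratch directory is empty by now
        }
    }
}
```

Whether the real Attach API can be pointed at such a scratch directory is exactly the open question raised above, so the sketch only models the file handling, not an actual attach.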