RFR: 8176717: GC log file handle leaked to child processes

Thomas Stüfe thomas.stuefe at gmail.com
Mon Mar 12 16:48:58 UTC 2018


Hi Leo,

On Mon, Mar 12, 2018 at 4:54 PM, Leo Korinth <leo.korinth at oracle.com> wrote:

>
>
> On 12/03/18 15:29, Thomas Stüfe wrote:
>
>> Hi Leo,
>>
>> This seems weird.
>>
>> This would affect numerous open() calls, not just this GC log, I cannot
>> imagine the correct fix is to change all of them.
>>
>
> Sorry, I do not understand what you mean with "numerous open()". This fix
> will only affect logging -- or am I missing something? os::open does
> roughly what I try to do in os::fopen_retain.
>
>
Sorry, I spoke unclear. What I meant was I would expect the problem you
found in gc logging to be present for every raw open()/fopen()/CreateFile()
call in the VM and in the JDK, which are quite a few. I wondered why we do
not see more problems like this.



>
>> In fact, on Posix platforms we close all file descriptors except the Pipe
>> ones before between fork() and exec() - see unix/native/libjava/
>> childproc.c.
>>
>
> Yes, that is why my test case did not fail before the fix on unix-like
> systems. I do not know why it is not handled in Windows (possibly a bug,
> possibly to keep old behaviour???), I had planned to ask that as a follow
> up question later, maybe open a bug report if it was not for keeping old
> behaviour. Even though childproc.c does close the file handler, I think it
> is much nicer to open them with FD_CLOEXEC (in addition to let childproc.c
> close it). os::open does so, and I would like to handle ::fopen the same
> way as ::open with a proxy call that ensures that the VM process will
> retain the file descriptor it opens (in HotSpot at least).
>
> Such code is missing on Windows - see windows/native/libjava/ProcessImpl_md.c
>>  . There, we do not have fork/exec, but CreateProcess(), and whether we
>> inherit handles or not is controlled via an argument to CreateProcess().
>> But that flag is TRUE, so child processes inherit handles.
>>
>> 331                    if (!CreateProcessW(
>> 332                        NULL,             /* executable name */
>> 333                        (LPWSTR)pcmd,     /* command line */
>> 334                        NULL,             /* process security
>> attribute */
>> 335                        NULL,             /* thread security attribute
>> */
>> 336                        TRUE,             /* inherits system handles
>> */          <<<<<<
>> 337                        processFlag,      /* selected based on exe
>> type */
>> 338                        (LPVOID)penvBlock,/* environment block */
>> 339                        (LPCWSTR)pdir,    /* change to the new current
>> directory */
>> 340                        &si,              /* (in)  startup information
>> */
>> 341                        &pi))             /* (out) process information
>> */
>> 342                    {
>> 343                        win32Error(env, L"CreateProcess");
>> 344                    }
>>
>> Maybe this is the real error we should fix? Make Windows Runtime.exec
>> behave like the Posix variant by closing all file descriptors upon
>> CreateProcess >
>> (This seems more of a core-libs question.)
>>
>
> I think it is both a core-libs question and a hotspot question. I firmly
> believe we should retain file descriptors with help of FD_CLOEXEC and its
> variants in HotSpot. I am unsure (and have no opinion) what to do in
> core-libs, maybe there is a deeper thought behind line 336?
>
> Some reasons for this:
>
> - if a process is forked using JNI, it would still be good if the hotspot
> descriptors would not leak.
>
> - if (I have no idea if this is true) the behaviour in core-libs can not
> be changed because the behaviour is already wildly (ab)used, this is still
> a correct fix. Remember this will only close file descriptors opened by
> HotSpot code, and at the moment only logging code.
>
> - this will fix the issue in the bug report, and give time for core-libs
> to consider what is correct (and what can be changed without breaking
> applications).
>
> Thanks,
> Leo
>
>
yes, you convinced me.

1 We should fix raw open() calls, because if native code forks via a
different code paths than java Runtime.exec(), we run into the same
problem. Your patch fixes one instance of the problem.

2 And we should fix Windows Runtime.exec() to the same behaviour as on
Posix. I can see this being backward-compatible-problematic, but it
certainly would be the right thing to do. Would love to know what core-libs
says.

Okay, about your change: I dislike that we add a new function, especially a
first class open function, to the os namespace. How about this instead:
since we know that os::open() does the right thing on all platforms, why
can we not just use os::open() instead? Afterwards call fdopen() to wrap a
FILE structure around it, respectively call "FILE* os::open(int fd, const
char* mode)" , which seems to be just a wrapped fdopen(). That way you can
get what you want with less change and without introducing a new API.

Kind Regards, Thomas




>
>> Kind Regards, Thomas
>>
>>
>> On Mon, Mar 12, 2018 at 2:20 PM, Leo Korinth <leo.korinth at oracle.com
>> <mailto:leo.korinth at oracle.com>> wrote:
>>
>>     Hi,
>>
>>     This fix is for all operating systems though the problem only seams
>>     to appear on windows.
>>
>>     I am creating a proxy function for fopen (os::fopen_retain) that
>>     appends the non-standard "e" mode for linux and bsds. For windows
>>     the "N" mode is used. For other operating systems, I assume that I
>>     can use fcntl F_SETFD FD_CLOEXEC. I think this will work for AIX,
>>     Solaris and other operating systems that do not support the "e"
>>     flag. Feedback otherwise please!
>>
>>     The reason that I use the mode "e" and not only fcntl for linux and
>>     bsds is threefold. First, I still need to use mode flags on windows
>>     as it does not support fcntl. Second, I probably save a system call.
>>     Third, the change will be applied directly, and there will be no
>>     point in time (between system calls) when the process can leak the
>>     file descriptor, so it is safer.
>>
>>     The test case forks three VMs in a row. By doing so we know that the
>>     second VM is opened with a specific log file. The third VM should
>>     have less open file descriptors (as it is does not use logging)
>>     which is checked using a UnixOperatingSystemMXBean. This is not
>>     possible on windows, so on windows I try to rename the file, which
>>     will not work if the file is opened (the actual reason the bug was
>>     opened).
>>
>>     The added test case shows that the bug fix closes the log file on
>>     windows. The VM on other operating systems closed the log file even
>>     before the fix.
>>
>>     Maybe the test case should be moved to a different path?
>>
>>     Bug:
>>     https://bugs.openjdk.java.net/browse/JDK-8176717
>>     <https://bugs.openjdk.java.net/browse/JDK-8176717>
>>     https://bugs.openjdk.java.net/browse/JDK-8176809
>>     <https://bugs.openjdk.java.net/browse/JDK-8176809>
>>
>>     Webrev:
>>     http://cr.openjdk.java.net/~lkorinth/8176717/00/
>>     <http://cr.openjdk.java.net/~lkorinth/8176717/00/>
>>
>>     Testing:
>>     hs-tier1, hs-tier2 and TestInheritFD.java
>>     (on 64-bit linux, solaris, windows and mac)
>>
>>     Thanks,
>>     Leo
>>
>>
>>


More information about the hotspot-dev mailing list