RFR: 8176717: GC log file handle leaked to child processes
Thomas Stüfe
thomas.stuefe at gmail.com
Mon Mar 12 16:48:58 UTC 2018
Hi Leo,
On Mon, Mar 12, 2018 at 4:54 PM, Leo Korinth <leo.korinth at oracle.com> wrote:
>
>
> On 12/03/18 15:29, Thomas Stüfe wrote:
>
>> Hi Leo,
>>
>> This seems weird.
>>
>> This would affect numerous open() calls, not just this GC log, I cannot
>> imagine the correct fix is to change all of them.
>>
>
> Sorry, I do not understand what you mean with "numerous open()". This fix
> will only affect logging -- or am I missing something? os::open does
> roughly what I try to do in os::fopen_retain.
>
>
Sorry, I spoke unclear. What I meant was I would expect the problem you
found in gc logging to be present for every raw open()/fopen()/CreateFile()
call in the VM and in the JDK, which are quite a few. I wondered why we do
not see more problems like this.
>
>> In fact, on Posix platforms we close all file descriptors except the Pipe
>> ones before between fork() and exec() - see unix/native/libjava/
>> childproc.c.
>>
>
> Yes, that is why my test case did not fail before the fix on unix-like
> systems. I do not know why it is not handled in Windows (possibly a bug,
> possibly to keep old behaviour???), I had planned to ask that as a follow
> up question later, maybe open a bug report if it was not for keeping old
> behaviour. Even though childproc.c does close the file handler, I think it
> is much nicer to open them with FD_CLOEXEC (in addition to let childproc.c
> close it). os::open does so, and I would like to handle ::fopen the same
> way as ::open with a proxy call that ensures that the VM process will
> retain the file descriptor it opens (in HotSpot at least).
>
> Such code is missing on Windows - see windows/native/libjava/ProcessImpl_md.c
>> . There, we do not have fork/exec, but CreateProcess(), and whether we
>> inherit handles or not is controlled via an argument to CreateProcess().
>> But that flag is TRUE, so child processes inherit handles.
>>
>> 331 if (!CreateProcessW(
>> 332 NULL, /* executable name */
>> 333 (LPWSTR)pcmd, /* command line */
>> 334 NULL, /* process security
>> attribute */
>> 335 NULL, /* thread security attribute
>> */
>> 336 TRUE, /* inherits system handles
>> */ <<<<<<
>> 337 processFlag, /* selected based on exe
>> type */
>> 338 (LPVOID)penvBlock,/* environment block */
>> 339 (LPCWSTR)pdir, /* change to the new current
>> directory */
>> 340 &si, /* (in) startup information
>> */
>> 341 &pi)) /* (out) process information
>> */
>> 342 {
>> 343 win32Error(env, L"CreateProcess");
>> 344 }
>>
>> Maybe this is the real error we should fix? Make Windows Runtime.exec
>> behave like the Posix variant by closing all file descriptors upon
>> CreateProcess >
>> (This seems more of a core-libs question.)
>>
>
> I think it is both a core-libs question and a hotspot question. I firmly
> believe we should retain file descriptors with help of FD_CLOEXEC and its
> variants in HotSpot. I am unsure (and have no opinion) what to do in
> core-libs, maybe there is a deeper thought behind line 336?
>
> Some reasons for this:
>
> - if a process is forked using JNI, it would still be good if the hotspot
> descriptors would not leak.
>
> - if (I have no idea if this is true) the behaviour in core-libs can not
> be changed because the behaviour is already wildly (ab)used, this is still
> a correct fix. Remember this will only close file descriptors opened by
> HotSpot code, and at the moment only logging code.
>
> - this will fix the issue in the bug report, and give time for core-libs
> to consider what is correct (and what can be changed without breaking
> applications).
>
> Thanks,
> Leo
>
>
yes, you convinced me.
1 We should fix raw open() calls, because if native code forks via a
different code paths than java Runtime.exec(), we run into the same
problem. Your patch fixes one instance of the problem.
2 And we should fix Windows Runtime.exec() to the same behaviour as on
Posix. I can see this being backward-compatible-problematic, but it
certainly would be the right thing to do. Would love to know what core-libs
says.
Okay, about your change: I dislike that we add a new function, especially a
first class open function, to the os namespace. How about this instead:
since we know that os::open() does the right thing on all platforms, why
can we not just use os::open() instead? Afterwards call fdopen() to wrap a
FILE structure around it, respectively call "FILE* os::open(int fd, const
char* mode)" , which seems to be just a wrapped fdopen(). That way you can
get what you want with less change and without introducing a new API.
Kind Regards, Thomas
>
>> Kind Regards, Thomas
>>
>>
>> On Mon, Mar 12, 2018 at 2:20 PM, Leo Korinth <leo.korinth at oracle.com
>> <mailto:leo.korinth at oracle.com>> wrote:
>>
>> Hi,
>>
>> This fix is for all operating systems though the problem only seams
>> to appear on windows.
>>
>> I am creating a proxy function for fopen (os::fopen_retain) that
>> appends the non-standard "e" mode for linux and bsds. For windows
>> the "N" mode is used. For other operating systems, I assume that I
>> can use fcntl F_SETFD FD_CLOEXEC. I think this will work for AIX,
>> Solaris and other operating systems that do not support the "e"
>> flag. Feedback otherwise please!
>>
>> The reason that I use the mode "e" and not only fcntl for linux and
>> bsds is threefold. First, I still need to use mode flags on windows
>> as it does not support fcntl. Second, I probably save a system call.
>> Third, the change will be applied directly, and there will be no
>> point in time (between system calls) when the process can leak the
>> file descriptor, so it is safer.
>>
>> The test case forks three VMs in a row. By doing so we know that the
>> second VM is opened with a specific log file. The third VM should
>> have less open file descriptors (as it is does not use logging)
>> which is checked using a UnixOperatingSystemMXBean. This is not
>> possible on windows, so on windows I try to rename the file, which
>> will not work if the file is opened (the actual reason the bug was
>> opened).
>>
>> The added test case shows that the bug fix closes the log file on
>> windows. The VM on other operating systems closed the log file even
>> before the fix.
>>
>> Maybe the test case should be moved to a different path?
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8176717
>> <https://bugs.openjdk.java.net/browse/JDK-8176717>
>> https://bugs.openjdk.java.net/browse/JDK-8176809
>> <https://bugs.openjdk.java.net/browse/JDK-8176809>
>>
>> Webrev:
>> http://cr.openjdk.java.net/~lkorinth/8176717/00/
>> <http://cr.openjdk.java.net/~lkorinth/8176717/00/>
>>
>> Testing:
>> hs-tier1, hs-tier2 and TestInheritFD.java
>> (on 64-bit linux, solaris, windows and mac)
>>
>> Thanks,
>> Leo
>>
>>
>>
More information about the hotspot-dev
mailing list