RFR: 8324668: JDWP process management needs more efficient file descriptor handling [v4]

Jaikiran Pai jpai at openjdk.org
Tue Jan 30 16:17:59 UTC 2024


On Tue, 30 Jan 2024 06:52:49 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> src/jdk.jdwp.agent/unix/native/libjdwp/exec_md.c line 129:
>> 
>>> 127:         /* Find max allowed file descriptors for a process
>>> 128:          * and assume all were opened for the parent process and
>>> 129:          * copied over to this child process. We close them all. */
>> 
>> I'm somewhat leaning towards this just being a fatal error. Why would it ever fail? Do we think the version in childproc.c that also does this is doing so for a good reason, or perhaps just like jdwp it used to use the slow version and then kept it around as a backup when the faster version was introduced.
>
> I think we are sliding into " the perfect is the enemy of the good" here. If you want to investigate the resilience of closing FD's then I suggest you look at that in a different PR and let Jai integrate this optimization to avoid the timeouts that were seen.

Hello Chris, I had a look at the ProcessImpl implementation to see why that fallback is in place. Looking at the ProcessImpl implementation, it has "launch mechanisms" when launching a child process. It is configurable through a system property.
By default, when not configured, the ProcessImpl implementation uses posix_spawn() instead of fork(). So in its current form on macos (given the context of this PR), when the ProcessBuilder java API gets used to start child processes, the ProcessImpl's implementation ends up using posix_spawn() and thus this entire file descriptor closing in childproc.c never gets called - neither the closeDescriptors() nor the fallback of closing OPEN_MAX number of descriptors. (I won't go into implementation details of what ProcessImpl does when posix_spawn() is used, because after looking at it, I have more questions than answers and I think discussing those here is going to cause distraction).

As for this specific if block, the only time we enter here is if we can't `opendir()` the platform specific directory containing the open file descriptors. Whether or not it realistically happens, I'm unsure. I think having this fallback is perhaps a better option than just failing in such cases - this fallback when activated will certainly "slowdown" the launch of the arbitrary command but when no timeouts are in place, the arbritary command would indeed get launched.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17588#discussion_r1471506065


More information about the serviceability-dev mailing list