RFR (M/L): 8131168: Refactor ProcessHandleImpl_*.c and add implememtation for AIX

Tue Aug 11 15:52:14 UTC 2015

On Tue, Aug 11, 2015 at 5:08 PM, Roger Riggs <Roger.Riggs at oracle.com> wrote:
> Hi Volker,
>
> Thanks for checking into the details of the OS X sysctl.  I'm fine with the
> current implementation.
>
> The rest of the updates and the additional tests look fine also.
>

Phew! I was already afraid I would have to switch to double-digit
versions for my webrevs :)

> But I need to check on the CCC status.
>

OK, please let me know once it is ready/approved.

Regards,
Volker

> Thanks, Roger
>
>
>
>
> On 8/10/15 10:13 AM, Volker Simonis wrote:
>
> On Fri, Aug 7, 2015 at 8:54 PM, Roger Riggs <Roger.Riggs at oracle.com> wrote:
>
> Hi Volker,
>
> 1) ProcessHandle.java:243
>
> For the definition of the new commandLine method I would:
>
>   - Use @link instead of @code for references to commands() and arguments()
> for easy navigation
>
>   - @implNote[1] should I think be changed to @apiNote:
>   the text describes not the JDK implementation but is information about the
>   information returned and is use to the application developer, not the JDK
> implementation.
>
>   -  The specific references to Linux implementation command line length
> parameters seems out of place
>       and should be omitted.
>
>         /**
>          * Returns the command line of the process.
>          * <p>
>          * If {@link #command command()} and  {@link #arguments arguments()}
> return non-null
>          * optionals, this is simply a convenience method which concatenates
>          * the values of the two functions separated by spaces. Otherwise it
> will return a
>          * best-effort, platform dependent representation of the command
> line.
>          *
>          * @apiNote Note that the returned executable pathname and the
>          *           arguments may be truncated on some platforms due to
> system
>          *           limitations.
>          *           <p>
>          *           The executable pathname may contain only the
>          *           name of the executable without the full path
> information.
>          *           It is undecideable whether white space separates
> different
>          *           arguments or is part of a single argument.
>          *
>          * @return an {@code Optional<String>} of the command line
>          *         of the process
>          */
>         public Optional<String> commandLine();
>
>
> ProcessHandle.java:252: in arguments() method - @apiNote is a better fit for
> the note
>
> ProcessHandleImpl_macosx.c:276:   - indentation +4
>
> Thanks for the correction. I've taken your wording you suggested.
>
> 2) ProcessHandleImpl_macosx.c:192:
>     if (errno != EINVAL) ...  There was previously this test,  I'm concerned
> that if the pid is invalid,
>     it will now throw a RuntimeException instead of returning -1.
>     I recall a discussion from May that recommended testing for EINVAL.
>     The sysctl in getCmdlineAndUserInfo also does not throw if errno !=
> EINVAL, so the usage
>     is not consistent (probably my coding) but needs investigation.
>
> Not sure about this one and couldn't find any previous discussion
> about the topic.
>
> But, according to the sysctl man-page, EINVAL is only returned if:
>  - The name array is less than two or greater than CTL_MAXNAME.
>  - A non-null newp is given and its specified length in newlen is too
> large or too small.
>
> The first case can not happen because we always statically allocate
> arrays of the correct size.
> The second case can not happen as well, because we always have 'newp ==
> NULL'.
>
> So according to this information I don't see any reason why we should
> check for EINVAL. I think the right solution is to check for 'oldlenp
>
> 0' which we already do. By the way, this is also the check applied
>
> by the psutils (see the implementation of 'get_kinfo_proc()' in [1]).
>
> So I wnated to also removed the last check for EINVAL in
> getCmdlineAndUserInfo(). But for some reason, that seems to be really
> necessary. Without it, we will get a RuntinmeException if we call
> sysinfo for pid==0 for example. Further research showed that the
> kernel seems to really return EINVAL for KERN_PROCARGS2 (see function
> sysctl_procargsx() in [2]). But KERN_PROCARGS2 isn't specified as a
> supported constant in the sysctl man-page, so the information there is
> still valid :)
>
> On the other hand, I found that the psutils alo handles EINVAL only
> for KERN_PROCARGS2 (see get_arg_list() in [1]).
>
> So to cut a long story short, I think the current implementation is
> safe as it is now!
>
> [1] http://psutil.googlecode.com/svn/trunk/psutil/arch/osx/process_info.c
> [2]
> http://www.opensource.apple.com/source/xnu/xnu-1699.24.23/bsd/kern/kern_sysctl.c
>
> 3) ProcessHandleImpl_solaris.c can do without the includes:
> #include "jni_util.h"
> #include "java_lang_ProcessHandleImpl.h"
> #include "java_lang_ProcessHandleImpl_Info.h"
>
> #include <stdio.h>
>
> 4) Ditto ProcessHandleImpl_aix.c
>
> Thanks, fixed.
>
> 5) ProcessHandleImpl_unix.c: 618:  typo  "fuctions" -> "functions"
>
> Fixed.
>
> 6) ProcessHandleImpl_unix.c:463: Would it be worthwhile to put
> sysconf(_SC_GETPW_R_SIZE_MAX) in a static?
>
> Good point! While doing this I realized that 'clock_ticks_per_second'
> is only used on Linux. So I moved the declaration of
> 'clock_ticks_per_second' from ProcessHandleImpl_unix.hpp to
> ProcessHandleImpl_linux.cpp and its initialization to  os_initNative()
> in the same file.
>
> I also declare a new static 'getpw_buf_size' in
> ProcessHandleImpl_unix.cpp and initialize it in
> Java_java_lang_ProcessHandleImpl_initNative() in the same file.
>
> 7) OnExitTest.java: exits without an error, just output in the log; it would
> escape attention.
>     The code respects the timeout setting.
>     Suggest removing the 'return' @133; the test will produce some errors
> and when debugging
>     the note will be in the log.
>
> Done.
>
> 8) ProcessHandleImpl_unix.h:58 - update the comment to include 0 as a valid
> parent pid.
>
> Fixed.
>
> Looks pretty good,
> Thanks, Roger
>
> Thanks:)
>
> I've also added two rudimentary tests for the new commandLine()
> function to InfoTest.test2():
>
>  - if both, 'command()' and 'arguments()' are available, I check that
> 'commandLine()' starts with 'command()' and contains all the arguments
> from 'arguments()'.
>
>  - otherwise, if 'commandLine()' is available, I check that it at
> least contains 'java'.  And as long as it is big enough it should also
> contain the corresponding arguments.
>
> The two new tests have been verified to pass on Windows, Linux, MacOS
> X, Solaris and AIX.
>
> The new version can be found here:
>
> http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v6/
>
> Regards,
> Volker
>
>
> [1] http://openjdk.java.net/jeps/8068562  - JEP draft: javadoc tags to
> distinguish API, implementation, specification, and notes
>
>
> On 8/5/2015 3:56 PM, Volker Simonis wrote:
>
> Hi,
>
> so here's the webrev which implements the new Info.commandLine()
> method (I chose 'commandLine() ' instead of 'cmdline()' or
> 'commandline()' because the other getters are camel case as well):
>
> http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v4/
>
> From the JavaDoc of the new method:
>
> * If {@code command()} and  {@code arguments()} return non-null
> * optionals, this is simply a convenience method which concatenates
> * the values of the two functions. Otherwise it will return a
> * best-effort, platform dependent representation of the command line.
>
> This didn't change anything on MacOS X where there is no additional
> effort to get the command line.
>
> On Solaris and AIX, Info.commandLine() will always return the contents
> of psinfo.pr_psargs because there's no other method to get the exact
> arguments (i.e. Info.arguments() always returns NULL. I could
> therefore remove the extra handling of AIX/Solaris in the InfoTest
> from my initial change.
>
> On Linux, things are a little more complicated:
>
> - the initial implementation for Linux used arg[0] as 'command' if
> /proc/pid/exe wasn't readable. This was true for all the processes we
> don't own (except if we are running as root). But this information may
> be incomplete, because arg[0] may only contain the command without its
> full path. One example is 'sendmail' for which Info.command() reported
> "sendmail: accepting connections" but Info.arguments() was empty. This
> is wrong, because 'sendmail' changes its argv[] array. I have
> therefore disabled this behavior now that we have the 'commandLine()'
> method.
>
> - /proc/pid/cmdline is limited to PAGE_SIZE (normally 4096) characters
> on Linux. So strictly speaking, this isn't 'exact' information as well
> (there are plenty of complains that especially for Java programs this
> is not enough) and should go to 'commandLine()' instead to 'arguments'
> if /proc/pid/cmdline is truncated. I do check for this now.
>
> - the information in /proc/pid/cmdline can also be changed to
> something other than the original arguments if a program changes
> argv[] (which is not forbidden) but there's probably not much we can
> do to detect this. I've added a corresponding @implNote comment to
> JavaDoc of Info.arguments().
>
> - the initial implementation did not check for incomplete reads of
> /proc/pid/cmdline. This may be a problem on systems with PAGE_SIZE >
> 4096 (on Linux/ppc64 a page size of 65536 is not unusual). I'm now
> always reading the complete contents of /proc/pid/cmdline.
>
> - as far as I understand the current implementation, 'arguments()'
> returns the arguments array WITHOUT 'arg[0]' (which is the program
> name) but may
> be we should specify that more clearly in the JavaDoc of 'arguments()'.
>
> That's it. Hope you like it :)
>
> Regards,
> Volker
>
> On Fri, Jul 31, 2015 at 11:02 PM, Stuart Marks <stuart.marks at oracle.com>
> wrote:
>
> Hi Roger, Volker,
>
> Glad to see you guys are receptive to this and that it can move forward. Let
> me know if you'd like me to help out, for example with reviews or something.
>
> s'marks
>
>
> On 7/31/15 9:55 AM, Roger Riggs wrote:
>
> Hi Volker,
>
> I agree that adding an Info.commandline() method would be a good way
> to make the command line available and be able to describe that it is
> OS dependent and may be truncated.
> And having it assemble the command and arguments when they are available
> makes
> sense.
> As an API addition it will need a clear spec and I can run it through CCC
> so it
> gets
> another review and compatibility tests.
>
> Thanks, Roger
>
>
>
>
> On 7/31/2015 5:03 AM, Volker Simonis wrote:
>
> On Fri, Jul 31, 2015 at 2:51 AM, Stuart Marks <stuart.marks at oracle.com>
> wrote:
>
> On 7/29/15 11:36 AM, Volker Simonis wrote:
>
> !! ProcessHandleImpl_unix: 709-738:  I still disagree with returning
> truncated or incomplete
> values for the executable or arguments.
> Can anyone be a tie-breaker  (with a good rational and suggestion for
> how
> an
> application can use the data).
>
> As I wrote, I agree to hear other opinions here.
>
> All I want to say that this truncated data is actually what 'ps' is
> reporting on Solaris since decades and people seem to be happy with
> it. As use case I can imagine logging or monitoring (something like
> 'top' in Java) where this data will be just good enough.
>
> We could specially mark possibly incomplete data by extending the Info
> object with functions like commandIsExact() or argumentsIsPrecise().
> But notice that we can not statically preset these values per
> platform. For example on Solaris, the 'command()' would return a
> precise value for processes with the same uid like the VM but only
> inaccurate values for all other processes. The "arguments()" would be
> always inaccurate on Solaris/AIX.
>
> It seems like there are cases where either exact or only approximate
> information is available. And as you observed, you might get one or the
> other on the same platform, depending on the UID. It also might depend
> on
> process state; I believe that some information becomes inaccessible when
> the
> process enters the zombie state.
>
> I don't think we should simply ignore one case or the other, but I also
> don't think we should try to cram the different information into the
> same
> API.
>
> The current ProcessHandle.Info api has
>
>      Optional<String> command()
>      Optional<String[]> arguments()
>
> It sounds to me like Roger wants these to contain only exact
> information.
> That seems reasonable, and this probably needs to be specified more
> clearly
> to contrast with what's below.
>
> On Solaris, the psinfo_t struct has char pr_psargs[PRARGSZ] which is a
> single string which appears to be the concatenation of the arguments
> (maybe
> including the command name). It's also truncated to fit PRARGSZ. It
> doesn't
> make sense to me to try to return this as a String[], as the zeroth
> element
> of that array, and certainly not parsed out into "words". So maybe
> instead
> we should have a different API that returns an imprecise command line as
> a
> single string:
>
>      Optional<String> cmdline()
>
> (Naming bikeshed TBS). The semantics would be that this is the process'
> command and arguments concatenated into a single string (thus
> potentially
> losing argument boundaries) and also possibly truncated based on
> platform
> (COUGHsolarisCOUGH) limitations. It's certainly useful for printing out
> in a
> ps, top, or Activity Monitor style application, for informational
> purposes.
>
> If this were implemented, then on Solaris, command() and arguments()
> would
> always return empty optionals.
>
> I'm not sure what this should be if the exact information is available.
> It
> would be inconvenient if something that just wanted to print out an
> approximate command line had to check several different APIs to get the
> information. Maybe cmdline() could assemble the information from exact
> data
> if it's is available, by concatenating the Strings from command() and
> arguments(), as a convenience to the caller. But I could go either way
> on
> this.
>
> Not sure this counts as a tie-breaker, but it might be a reasonable way
> forward.
>
> s'marks
>
> Hi Stuart,
>
> thanks a lot for your comments - I like your proposal. For me this
> sounds like a good compromise.
>
> @Roger: should I go and add a new field commandLine and the
> corresponding getter to the Info class? As Stuart proposed, the getter
> could check if 'command' and 'arguments' are available and assemble
> the command line from them. Otherwise it could use the content of the
> commandLine field if that is available.
>
> Regards,
> Volker
>
>
>