RFR (M/L): 8131168: Refactor ProcessHandleImpl_*.c and add implememtation for AIX

Roger Riggs Roger.Riggs at Oracle.com
Fri Aug 14 12:40:42 UTC 2015


Thanks Volker. Have a good vacation.

On 8/14/15 2:48 AM, Volker Simonis wrote:
> On Fri, Aug 14, 2015 at 3:24 AM, Roger Riggs <Roger.Riggs at oracle.com> wrote:
>> Hi Volker,
>>
>> The internal reviews are complete and accepted.
>>
>> The code looks fine.  Please push it.  I"ll be going on vacation so if it
>> breaks something,
>> someone else will need to pick up the pieces or back it out.
>>
> Great - I've just pushed the code!
>
> The only minor addition I've done was to add the new commandLine field
> to ProcessHandlImpl_win.c as well. It is currently not used, but for
> the sake of completeness I think it's more consistent now.
>
> Before pushing I've synced and build on Linux, Solaris, MacOS X, AIX
> and Windows. I've also rerun the ProcessHandle jtreg tests and they
> all passed. So hopefully nothing will break and there will be no
> reason for backing the change because I am already on vacation :)
>
> Thanks for your feedback and help and have a good vacation!
> Volker
>
>> Many thanks for the work to refactor and cleanup the details.
>>
>> Roger
>>
>>
>>
>> On 8/12/15 8:10 AM, Volker Simonis wrote:
>>
>> Hi Stuart,
>>
>> thanks for verifying the changes one more time.
>>
>> On Wed, Aug 12, 2015 at 12:19 AM, Stuart Marks <stuart.marks at oracle.com>
>> wrote:
>>
>> .. and of course right after I sent my previous message, I ran across
>> something worth noting.
>>
>> The proposed spec for commandLine() says,
>>
>> * If {@link #command command()} and  {@link #arguments arguments()} return
>> non-null
>> * optionals,
>>
>> The preferred term is "non-empty" instead of non-null. This is kind of
>> nitpicky but in fact command() and arguments() should NEVER return an actual
>> null; they should always return an Optional that is either empty or that has
>> a value. So I think this is important to change lest someone be misled into
>> writing
>>
>>      if (info.command() == null && info.arguments() == null) ...
>>
>> Good point. I've updated the documentation accordingly and created:
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v7/
>>
>> Regards,
>> Volker
>>
>> Thanks,
>>
>> s'marks
>>
>>
>>
>> On 8/11/15 3:07 PM, Stuart Marks wrote:
>>
>> Hi Volker,
>>
>> I looked at the proposed specification of commandLine() after the most
>> recent
>> round of reviews (which is 8131168.v6 I believe) and it looks fine to me.
>> It
>> expresses the intent pretty well. Oh, and the name "commandLine" is fine
>> and it
>> fits well with the names of the other methods.
>>
>> Thanks,
>>
>> s'marks
>>
>> On 8/11/15 8:52 AM, Volker Simonis wrote:
>>
>> On Tue, Aug 11, 2015 at 5:08 PM, Roger Riggs <Roger.Riggs at oracle.com>
>> wrote:
>>
>> Hi Volker,
>>
>> Thanks for checking into the details of the OS X sysctl.  I'm fine with
>> the
>> current implementation.
>>
>> The rest of the updates and the additional tests look fine also.
>>
>> Phew! I was already afraid I would have to switch to double-digit
>> versions for my webrevs :)
>>
>> But I need to check on the CCC status.
>>
>> OK, please let me know once it is ready/approved.
>>
>> Regards,
>> Volker
>>
>> Thanks, Roger
>>
>>
>>
>>
>> On 8/10/15 10:13 AM, Volker Simonis wrote:
>>
>> On Fri, Aug 7, 2015 at 8:54 PM, Roger Riggs <Roger.Riggs at oracle.com>
>> wrote:
>>
>> Hi Volker,
>>
>> 1) ProcessHandle.java:243
>>
>> For the definition of the new commandLine method I would:
>>
>>     - Use @link instead of @code for references to commands() and
>> arguments()
>> for easy navigation
>>
>>     - @implNote[1] should I think be changed to @apiNote:
>>     the text describes not the JDK implementation but is information
>> about the
>>     information returned and is use to the application developer, not the
>> JDK
>> implementation.
>>
>>     -  The specific references to Linux implementation command line
>> length
>> parameters seems out of place
>>         and should be omitted.
>>
>>           /**
>>            * Returns the command line of the process.
>>            * <p>
>>            * If {@link #command command()} and  {@link #arguments
>> arguments()}
>> return non-null
>>            * optionals, this is simply a convenience method which
>> concatenates
>>            * the values of the two functions separated by spaces.
>> Otherwise it
>> will return a
>>            * best-effort, platform dependent representation of the
>> command
>> line.
>>            *
>>            * @apiNote Note that the returned executable pathname and the
>>            *           arguments may be truncated on some platforms due
>> to
>> system
>>            *           limitations.
>>            *           <p>
>>            *           The executable pathname may contain only the
>>            *           name of the executable without the full path
>> information.
>>            *           It is undecideable whether white space separates
>> different
>>            *           arguments or is part of a single argument.
>>            *
>>            * @return an {@code Optional<String>} of the command line
>>            *         of the process
>>            */
>>           public Optional<String> commandLine();
>>
>>
>> ProcessHandle.java:252: in arguments() method - @apiNote is a better fit
>> for
>> the note
>>
>> ProcessHandleImpl_macosx.c:276:   - indentation +4
>>
>> Thanks for the correction. I've taken your wording you suggested.
>>
>> 2) ProcessHandleImpl_macosx.c:192:
>>       if (errno != EINVAL) ...  There was previously this test,  I'm
>> concerned
>> that if the pid is invalid,
>>       it will now throw a RuntimeException instead of returning -1.
>>       I recall a discussion from May that recommended testing for EINVAL.
>>       The sysctl in getCmdlineAndUserInfo also does not throw if errno !=
>> EINVAL, so the usage
>>       is not consistent (probably my coding) but needs investigation.
>>
>> Not sure about this one and couldn't find any previous discussion
>> about the topic.
>>
>> But, according to the sysctl man-page, EINVAL is only returned if:
>>    - The name array is less than two or greater than CTL_MAXNAME.
>>    - A non-null newp is given and its specified length in newlen is too
>> large or too small.
>>
>> The first case can not happen because we always statically allocate
>> arrays of the correct size.
>> The second case can not happen as well, because we always have 'newp ==
>> NULL'.
>>
>> So according to this information I don't see any reason why we should
>> check for EINVAL. I think the right solution is to check for 'oldlenp
>>
>> 0' which we already do. By the way, this is also the check applied
>>
>> by the psutils (see the implementation of 'get_kinfo_proc()' in [1]).
>>
>> So I wnated to also removed the last check for EINVAL in
>> getCmdlineAndUserInfo(). But for some reason, that seems to be really
>> necessary. Without it, we will get a RuntinmeException if we call
>> sysinfo for pid==0 for example. Further research showed that the
>> kernel seems to really return EINVAL for KERN_PROCARGS2 (see function
>> sysctl_procargsx() in [2]). But KERN_PROCARGS2 isn't specified as a
>> supported constant in the sysctl man-page, so the information there is
>> still valid :)
>>
>> On the other hand, I found that the psutils alo handles EINVAL only
>> for KERN_PROCARGS2 (see get_arg_list() in [1]).
>>
>> So to cut a long story short, I think the current implementation is
>> safe as it is now!
>>
>> [1]
>> http://psutil.googlecode.com/svn/trunk/psutil/arch/osx/process_info.c
>> [2]
>>
>> http://www.opensource.apple.com/source/xnu/xnu-1699.24.23/bsd/kern/kern_sysctl.c
>>
>> 3) ProcessHandleImpl_solaris.c can do without the includes:
>> #include "jni_util.h"
>> #include "java_lang_ProcessHandleImpl.h"
>> #include "java_lang_ProcessHandleImpl_Info.h"
>>
>> #include <stdio.h>
>>
>> 4) Ditto ProcessHandleImpl_aix.c
>>
>> Thanks, fixed.
>>
>> 5) ProcessHandleImpl_unix.c: 618:  typo  "fuctions" -> "functions"
>>
>> Fixed.
>>
>> 6) ProcessHandleImpl_unix.c:463: Would it be worthwhile to put
>> sysconf(_SC_GETPW_R_SIZE_MAX) in a static?
>>
>> Good point! While doing this I realized that 'clock_ticks_per_second'
>> is only used on Linux. So I moved the declaration of
>> 'clock_ticks_per_second' from ProcessHandleImpl_unix.hpp to
>> ProcessHandleImpl_linux.cpp and its initialization to  os_initNative()
>> in the same file.
>>
>> I also declare a new static 'getpw_buf_size' in
>> ProcessHandleImpl_unix.cpp and initialize it in
>> Java_java_lang_ProcessHandleImpl_initNative() in the same file.
>>
>> 7) OnExitTest.java: exits without an error, just output in the log; it
>> would
>> escape attention.
>>       The code respects the timeout setting.
>>       Suggest removing the 'return' @133; the test will produce some
>> errors
>> and when debugging
>>       the note will be in the log.
>>
>> Done.
>>
>> 8) ProcessHandleImpl_unix.h:58 - update the comment to include 0 as a
>> valid
>> parent pid.
>>
>> Fixed.
>>
>> Looks pretty good,
>> Thanks, Roger
>>
>> Thanks:)
>>
>> I've also added two rudimentary tests for the new commandLine()
>> function to InfoTest.test2():
>>
>>    - if both, 'command()' and 'arguments()' are available, I check that
>> 'commandLine()' starts with 'command()' and contains all the arguments
>> from 'arguments()'.
>>
>>    - otherwise, if 'commandLine()' is available, I check that it at
>> least contains 'java'.  And as long as it is big enough it should also
>> contain the corresponding arguments.
>>
>> The two new tests have been verified to pass on Windows, Linux, MacOS
>> X, Solaris and AIX.
>>
>> The new version can be found here:
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v6/
>>
>> Regards,
>> Volker
>>
>>
>> [1] http://openjdk.java.net/jeps/8068562  - JEP draft: javadoc tags to
>> distinguish API, implementation, specification, and notes
>>
>>
>> On 8/5/2015 3:56 PM, Volker Simonis wrote:
>>
>> Hi,
>>
>> so here's the webrev which implements the new Info.commandLine()
>> method (I chose 'commandLine() ' instead of 'cmdline()' or
>> 'commandline()' because the other getters are camel case as well):
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v4/
>>
>>   From the JavaDoc of the new method:
>>
>> * If {@code command()} and  {@code arguments()} return non-null
>> * optionals, this is simply a convenience method which concatenates
>> * the values of the two functions. Otherwise it will return a
>> * best-effort, platform dependent representation of the command line.
>>
>> This didn't change anything on MacOS X where there is no additional
>> effort to get the command line.
>>
>> On Solaris and AIX, Info.commandLine() will always return the contents
>> of psinfo.pr_psargs because there's no other method to get the exact
>> arguments (i.e. Info.arguments() always returns NULL. I could
>> therefore remove the extra handling of AIX/Solaris in the InfoTest
>> from my initial change.
>>
>> On Linux, things are a little more complicated:
>>
>> - the initial implementation for Linux used arg[0] as 'command' if
>> /proc/pid/exe wasn't readable. This was true for all the processes we
>> don't own (except if we are running as root). But this information may
>> be incomplete, because arg[0] may only contain the command without its
>> full path. One example is 'sendmail' for which Info.command() reported
>> "sendmail: accepting connections" but Info.arguments() was empty. This
>> is wrong, because 'sendmail' changes its argv[] array. I have
>> therefore disabled this behavior now that we have the 'commandLine()'
>> method.
>>
>> - /proc/pid/cmdline is limited to PAGE_SIZE (normally 4096) characters
>> on Linux. So strictly speaking, this isn't 'exact' information as well
>> (there are plenty of complains that especially for Java programs this
>> is not enough) and should go to 'commandLine()' instead to 'arguments'
>> if /proc/pid/cmdline is truncated. I do check for this now.
>>
>> - the information in /proc/pid/cmdline can also be changed to
>> something other than the original arguments if a program changes
>> argv[] (which is not forbidden) but there's probably not much we can
>> do to detect this. I've added a corresponding @implNote comment to
>> JavaDoc of Info.arguments().
>>
>> - the initial implementation did not check for incomplete reads of
>> /proc/pid/cmdline. This may be a problem on systems with PAGE_SIZE >
>> 4096 (on Linux/ppc64 a page size of 65536 is not unusual). I'm now
>> always reading the complete contents of /proc/pid/cmdline.
>>
>> - as far as I understand the current implementation, 'arguments()'
>> returns the arguments array WITHOUT 'arg[0]' (which is the program
>> name) but may
>> be we should specify that more clearly in the JavaDoc of 'arguments()'.
>>
>> That's it. Hope you like it :)
>>
>> Regards,
>> Volker
>>
>> On Fri, Jul 31, 2015 at 11:02 PM, Stuart Marks <stuart.marks at oracle.com>
>> wrote:
>>
>> Hi Roger, Volker,
>>
>> Glad to see you guys are receptive to this and that it can move forward.
>> Let
>> me know if you'd like me to help out, for example with reviews or
>> something.
>>
>> s'marks
>>
>>
>> On 7/31/15 9:55 AM, Roger Riggs wrote:
>>
>> Hi Volker,
>>
>> I agree that adding an Info.commandline() method would be a good way
>> to make the command line available and be able to describe that it is
>> OS dependent and may be truncated.
>> And having it assemble the command and arguments when they are available
>> makes
>> sense.
>> As an API addition it will need a clear spec and I can run it through
>> CCC
>> so it
>> gets
>> another review and compatibility tests.
>>
>> Thanks, Roger
>>
>>
>>
>>
>> On 7/31/2015 5:03 AM, Volker Simonis wrote:
>>
>> On Fri, Jul 31, 2015 at 2:51 AM, Stuart Marks <stuart.marks at oracle.com>
>> wrote:
>>
>> On 7/29/15 11:36 AM, Volker Simonis wrote:
>>
>> !! ProcessHandleImpl_unix: 709-738:  I still disagree with returning
>> truncated or incomplete
>> values for the executable or arguments.
>> Can anyone be a tie-breaker  (with a good rational and suggestion for
>> how
>> an
>> application can use the data).
>>
>> As I wrote, I agree to hear other opinions here.
>>
>> All I want to say that this truncated data is actually what 'ps' is
>> reporting on Solaris since decades and people seem to be happy with
>> it. As use case I can imagine logging or monitoring (something like
>> 'top' in Java) where this data will be just good enough.
>>
>> We could specially mark possibly incomplete data by extending the Info
>> object with functions like commandIsExact() or argumentsIsPrecise().
>> But notice that we can not statically preset these values per
>> platform. For example on Solaris, the 'command()' would return a
>> precise value for processes with the same uid like the VM but only
>> inaccurate values for all other processes. The "arguments()" would be
>> always inaccurate on Solaris/AIX.
>>
>> It seems like there are cases where either exact or only approximate
>> information is available. And as you observed, you might get one or the
>> other on the same platform, depending on the UID. It also might depend
>> on
>> process state; I believe that some information becomes inaccessible when
>> the
>> process enters the zombie state.
>>
>> I don't think we should simply ignore one case or the other, but I also
>> don't think we should try to cram the different information into the
>> same
>> API.
>>
>> The current ProcessHandle.Info api has
>>
>>        Optional<String> command()
>>        Optional<String[]> arguments()
>>
>> It sounds to me like Roger wants these to contain only exact
>> information.
>> That seems reasonable, and this probably needs to be specified more
>> clearly
>> to contrast with what's below.
>>
>> On Solaris, the psinfo_t struct has char pr_psargs[PRARGSZ] which is a
>> single string which appears to be the concatenation of the arguments
>> (maybe
>> including the command name). It's also truncated to fit PRARGSZ. It
>> doesn't
>> make sense to me to try to return this as a String[], as the zeroth
>> element
>> of that array, and certainly not parsed out into "words". So maybe
>> instead
>> we should have a different API that returns an imprecise command line as
>> a
>> single string:
>>
>>        Optional<String> cmdline()
>>
>> (Naming bikeshed TBS). The semantics would be that this is the process'
>> command and arguments concatenated into a single string (thus
>> potentially
>> losing argument boundaries) and also possibly truncated based on
>> platform
>> (COUGHsolarisCOUGH) limitations. It's certainly useful for printing out
>> in a
>> ps, top, or Activity Monitor style application, for informational
>> purposes.
>>
>> If this were implemented, then on Solaris, command() and arguments()
>> would
>> always return empty optionals.
>>
>> I'm not sure what this should be if the exact information is available.
>> It
>> would be inconvenient if something that just wanted to print out an
>> approximate command line had to check several different APIs to get the
>> information. Maybe cmdline() could assemble the information from exact
>> data
>> if it's is available, by concatenating the Strings from command() and
>> arguments(), as a convenience to the caller. But I could go either way
>> on
>> this.
>>
>> Not sure this counts as a tie-breaker, but it might be a reasonable way
>> forward.
>>
>> s'marks
>>
>> Hi Stuart,
>>
>> thanks a lot for your comments - I like your proposal. For me this
>> sounds like a good compromise.
>>
>> @Roger: should I go and add a new field commandLine and the
>> corresponding getter to the Info class? As Stuart proposed, the getter
>> could check if 'command' and 'arguments' are available and assemble
>> the command line from them. Otherwise it could use the content of the
>> commandLine field if that is available.
>>
>> Regards,
>> Volker
>>
>>
>>
>>




More information about the core-libs-dev mailing list