RFR (M/L): 8131168: Refactor ProcessHandleImpl_*.c and add implememtation for AIX
Hi, I worked on refactoring the native ProcessHandleImpl implementation to enable more code sharing and make it easier to add support for new platforms (e.g. AIX) and I think I had a pretty nice version running. But unfortunately I've just realized that the recent changes to ProcessHandleImpl (i.e. "8078099: (process) ProcessHandle should uniquely identify processes" and "8078108: (process) ProcessHandle.isAlive should be robust") have been massive so I have to start over to merge all my changes with the new version. But before doing that I just wanted to post my current changes which cleanly apply to the repo before 8078099 and ask for your opinion: http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/ Here's a summary (taken from ProcesHandleImpl_unix.c) of what I've actually done (if the output appears scrambled in you mail better read it in the webrev http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/src/java.base/un... ): /** * This file contains the implementation of the native ProcessHandleImpl * functions which are common to all Unix variants. * * The currently supported Unix variants are Solaris, Linux, MaxOS X and AIX. * The various similarities and differences between these systems make it hard * to find a clear boundary between platform specific and shared code. * * In order to ease code sharing between the pltforms while still keeping the * code as clean as possible (i.e. free of preprocessor macros) we use the * following source code layout (remember that ProcessHandleImpl_unix.c will * be compiled on EVERY Unix platform while ProcessHandleImpl_<os>.c will be * only compiled on the specific OS): * * - all the JNI wrappers for the ProcessHandleImpl functions go into this file * - if their implementation is common on ALL the supported Unix platforms it * goes right into the JNI wrappers * - if the whole function or substantial parts are platform dependent, the * implementation goes into os_<function_name> functions in * ProcessHandleImpl_<os>.c * - if at least two platforms implement an os_<function_name> function in the * same way, this implementation is factored out into unix_<function_name>, * placed into this file and called from the corresponding os_<function_name> * function. * - For convenience, all the os_ and unix_ functions are declared in * ProcessHandleImpl_unix.h which is included into every * ProcessHandleImpl_<os>.c file. * * Example 1: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_initIDs() * is the same on all platforms except on Linux where it initilizes one * additional field. So we place the implementation right into * Java_java_lang_ProcessHandleImpl_00024Info_initIDs() but add call to * os_init() at the end of the function which is empty on all platforms * except Linux where it performs the additionally initializations. * * Example 2: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_info0 is * the same on Solaris and AIX but different on Linux and MacOSX. We therefore * simply call the two helpers os_getStatInfo() and os_getCmdlineInfo(). The * Linux and MaxOS X versions of these functions (in the corresponding files * ProcessHandleImpl_linux.c and ProcessHandleImpl_macosx.c) directly contain * the platform specific implementations while the Solaris and AIX * implementations simply call back to unix_getStatInfo() and * unix_getCmdlineInfo() which are implemented right in this file. * * The term "same implementation" is still a question of interpretation. It my * be acceptable to have a few ifdef'ed lines if that allows the sharing of a * huge function. On the other hand, if the platform specific code in a shared * function grows over a certain limit, it may be better to refactor that * functionality into corresponding, platform-specific os_ functions. */ This resulted in the new file ProcessHandleImpl_linux.c which now only contains Linux-only code which was previously in ifdefed for Linux in ProcessHandleImpl_linux.c. The advantage is that we now have only one version of: Java_java_lang_ProcessHandleImpl_00024Info_initIDs unix_fillArgArray unix_uidToUser and on version of unix_getChildren which is shared by Solaris, Linux and AIX and one version of unix_getStatInfo unix_getCmdlineInfo which are shared by Solaris and Linux. Additionally I've added the AIX port following the new schema and I've slightly improved the Solaris port to use the information from psinfo.pr_psargs for reporting at least the first 80 characters of the command line arguments. arg[0] from psinfo.pr_psargs is also used as the "command()" string in the case where "/proc/%d/path/a.out" isn't readable. This helps to report a command string for all the processes we do not own, because in that case, "/proc/%d/path/a.out" isn't readable. By the way, using args[0] as a fall-back for cmd beeing null also helps on MacOS X for processes we don't own. Finally I've also done some test improvements. I have tested all my changes on Linux/amd64, Linux/ppc64, MacOS X, Solaris and AIX. I'd be happy to forward port my changes to the newest head version if you generally agree with my approach (and give me a short time frame where you promise not to do any massive changes :) What do you think? Thank you and best regards, Volker
Hi Volker, yes, generally, the approach is an improvement and provides a clearer model of os specific variations. More comments inline... On 7/17/2015 2:28 PM, Volker Simonis wrote:
Hi,
I worked on refactoring the native ProcessHandleImpl implementation to enable more code sharing and make it easier to add support for new platforms (e.g. AIX) and I think I had a pretty nice version running. But unfortunately I've just realized that the recent changes to ProcessHandleImpl (i.e. "8078099: (process) ProcessHandle should uniquely identify processes" and "8078108: (process) ProcessHandle.isAlive should be robust") have been massive so I have to start over to merge all my changes with the new version.
But before doing that I just wanted to post my current changes which cleanly apply to the repo before 8078099 and ask for your opinion:
http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/
Here's a summary (taken from ProcesHandleImpl_unix.c) of what I've actually done (if the output appears scrambled in you mail better read it in the webrev http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/src/java.base/un... ):
/** * This file contains the implementation of the native ProcessHandleImpl * functions which are common to all Unix variants. * * The currently supported Unix variants are Solaris, Linux, MaxOS X and AIX. * The various similarities and differences between these systems make it hard * to find a clear boundary between platform specific and shared code. * * In order to ease code sharing between the pltforms while still keeping the * code as clean as possible (i.e. free of preprocessor macros) we use the * following source code layout (remember that ProcessHandleImpl_unix.c will * be compiled on EVERY Unix platform while ProcessHandleImpl_<os>.c will be * only compiled on the specific OS): * * - all the JNI wrappers for the ProcessHandleImpl functions go into this file * - if their implementation is common on ALL the supported Unix platforms it * goes right into the JNI wrappers * - if the whole function or substantial parts are platform dependent, the * implementation goes into os_<function_name> functions in * ProcessHandleImpl_<os>.c * - if at least two platforms implement an os_<function_name> function in the * same way, this implementation is factored out into unix_<function_name>, * placed into this file and called from the corresponding os_<function_name> * function. * - For convenience, all the os_ and unix_ functions are declared in * ProcessHandleImpl_unix.h which is included into every * ProcessHandleImpl_<os>.c file. * * Example 1: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_initIDs() * is the same on all platforms except on Linux where it initilizes one * additional field. So we place the implementation right into * Java_java_lang_ProcessHandleImpl_00024Info_initIDs() but add call to * os_init() at the end of the function which is empty on all platforms * except Linux where it performs the additionally initializations. * * Example 2: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_info0 is * the same on Solaris and AIX but different on Linux and MacOSX. We therefore * simply call the two helpers os_getStatInfo() and os_getCmdlineInfo(). The * Linux and MaxOS X versions of these functions (in the corresponding files * ProcessHandleImpl_linux.c and ProcessHandleImpl_macosx.c) directly contain * the platform specific implementations while the Solaris and AIX * implementations simply call back to unix_getStatInfo() and * unix_getCmdlineInfo() which are implemented right in this file. * * The term "same implementation" is still a question of interpretation. It my * be acceptable to have a few ifdef'ed lines if that allows the sharing of a * huge function. On the other hand, if the platform specific code in a shared * function grows over a certain limit, it may be better to refactor that * functionality into corresponding, platform-specific os_ functions. */
This resulted in the new file ProcessHandleImpl_linux.c which now only contains Linux-only code which was previously in ifdefed for Linux in ProcessHandleImpl_linux.c. The advantage is that we now have only one version of:
Java_java_lang_ProcessHandleImpl_00024Info_initIDs unix_fillArgArray unix_uidToUser
and on version of
unix_getChildren
which is shared by Solaris, Linux and AIX and one version of
unix_getStatInfo unix_getCmdlineInfo
which are shared by Solaris and Linux.
Additionally I've added the AIX port following the new schema and I've slightly improved the Solaris port to use the information from psinfo.pr_psargs for reporting at least the first 80 characters of the command line arguments. arg[0] from psinfo.pr_psargs is also used as the "command()" string in the case where "/proc/%d/path/a.out" isn't readable. This helps to report a command string for all the processes we do not own, because in that case, "/proc/%d/path/a.out" isn't readable. By the way, using args[0] as a fall-back for cmd beeing null also helps on MacOS X for processes we don't own. Finally I've also done some test improvements. My bias was toward omitting information that was incomplete or not usable programatically. If the commandline is truncated then it is pretty useless to a program, the same with the executable command. If the application has to try to guess what's missing and how to use what is present then it should be omitted from the API. If there is a user present then they can use 'ps' or 'tasklist' and interpret the partial information.
The focus should be on an API that can be used to control/manage a set of applications the run together, usually under the same userid or with root privileges. Comments on the diffs: ProcessHandle.java: - The notes should probably be use @implNote (though I disagree with providing truncated information) ProcessHandleImpl_unix: - I would not have expected to find the Solaris and AIX implementations of getStatInfo and getCmdlineInfo here; but perhaps it depends on how one considers the lineage of the real unix. Since the more common platforms are Linux, Redhat, etc I would keep those versions in the ProcessHandleImpl_unix common file. Thanks, Roger
I have tested all my changes on Linux/amd64, Linux/ppc64, MacOS X, Solaris and AIX.
I'd be happy to forward port my changes to the newest head version if you generally agree with my approach (and give me a short time frame where you promise not to do any massive changes :)
What do you think?
Thank you and best regards, Volker
Hi Roger, thanks for looking at the webrev. Please find my comments inline: On Mon, Jul 20, 2015 at 4:01 PM, Roger Riggs <Roger.Riggs@oracle.com> wrote:
Hi Volker,
yes, generally, the approach is an improvement and provides a clearer model of os specific variations.
More comments inline...
On 7/17/2015 2:28 PM, Volker Simonis wrote:
Hi,
I worked on refactoring the native ProcessHandleImpl implementation to enable more code sharing and make it easier to add support for new platforms (e.g. AIX) and I think I had a pretty nice version running. But unfortunately I've just realized that the recent changes to ProcessHandleImpl (i.e. "8078099: (process) ProcessHandle should uniquely identify processes" and "8078108: (process) ProcessHandle.isAlive should be robust") have been massive so I have to start over to merge all my changes with the new version.
But before doing that I just wanted to post my current changes which cleanly apply to the repo before 8078099 and ask for your opinion:
http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/
Here's a summary (taken from ProcesHandleImpl_unix.c) of what I've actually done (if the output appears scrambled in you mail better read it in the webrev
http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/src/java.base/un... ):
/** * This file contains the implementation of the native ProcessHandleImpl * functions which are common to all Unix variants. * * The currently supported Unix variants are Solaris, Linux, MaxOS X and AIX. * The various similarities and differences between these systems make it hard * to find a clear boundary between platform specific and shared code. * * In order to ease code sharing between the pltforms while still keeping the * code as clean as possible (i.e. free of preprocessor macros) we use the * following source code layout (remember that ProcessHandleImpl_unix.c will * be compiled on EVERY Unix platform while ProcessHandleImpl_<os>.c will be * only compiled on the specific OS): * * - all the JNI wrappers for the ProcessHandleImpl functions go into this file * - if their implementation is common on ALL the supported Unix platforms it * goes right into the JNI wrappers * - if the whole function or substantial parts are platform dependent, the * implementation goes into os_<function_name> functions in * ProcessHandleImpl_<os>.c * - if at least two platforms implement an os_<function_name> function in the * same way, this implementation is factored out into unix_<function_name>, * placed into this file and called from the corresponding os_<function_name> * function. * - For convenience, all the os_ and unix_ functions are declared in * ProcessHandleImpl_unix.h which is included into every * ProcessHandleImpl_<os>.c file. * * Example 1: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_initIDs() * is the same on all platforms except on Linux where it initilizes one * additional field. So we place the implementation right into * Java_java_lang_ProcessHandleImpl_00024Info_initIDs() but add call to * os_init() at the end of the function which is empty on all platforms * except Linux where it performs the additionally initializations. * * Example 2: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_info0 is * the same on Solaris and AIX but different on Linux and MacOSX. We therefore * simply call the two helpers os_getStatInfo() and os_getCmdlineInfo(). The * Linux and MaxOS X versions of these functions (in the corresponding files * ProcessHandleImpl_linux.c and ProcessHandleImpl_macosx.c) directly contain * the platform specific implementations while the Solaris and AIX * implementations simply call back to unix_getStatInfo() and * unix_getCmdlineInfo() which are implemented right in this file. * * The term "same implementation" is still a question of interpretation. It my * be acceptable to have a few ifdef'ed lines if that allows the sharing of a * huge function. On the other hand, if the platform specific code in a shared * function grows over a certain limit, it may be better to refactor that * functionality into corresponding, platform-specific os_ functions. */
This resulted in the new file ProcessHandleImpl_linux.c which now only contains Linux-only code which was previously in ifdefed for Linux in ProcessHandleImpl_linux.c. The advantage is that we now have only one version of:
Java_java_lang_ProcessHandleImpl_00024Info_initIDs unix_fillArgArray unix_uidToUser
and on version of
unix_getChildren
which is shared by Solaris, Linux and AIX and one version of
unix_getStatInfo unix_getCmdlineInfo
which are shared by Solaris and Linux.
Additionally I've added the AIX port following the new schema and I've slightly improved the Solaris port to use the information from psinfo.pr_psargs for reporting at least the first 80 characters of the command line arguments. arg[0] from psinfo.pr_psargs is also used as the "command()" string in the case where "/proc/%d/path/a.out" isn't readable. This helps to report a command string for all the processes we do not own, because in that case, "/proc/%d/path/a.out" isn't readable. By the way, using args[0] as a fall-back for cmd beeing null also helps on MacOS X for processes we don't own. Finally I've also done some test improvements.
My bias was toward omitting information that was incomplete or not usable programatically. If the commandline is truncated then it is pretty useless to a program, the same with the executable command. If the application has to try to guess what's missing and how to use what is present then it should be omitted from the API. If there is a user present then they can use 'ps' or 'tasklist' and interpret the partial information.
But 'ps' is limited to 80 characters for arguments and command line as well on Solaris as far as I know. Moreover, on AIX I haven't found something better than this for the command line at all. Anyway we have to live with the fact that this information is extremely platform dependent. The least common divisor is surely the PID but for this API I prefer to expose "as much as possible" for a given platform. I suspect that once this API is out there in the world, people will use it for all kind of stuff like monitoring or logging and I'd rather prefer to see a truncated command line in a log file instad of just seeing a plain PID. So usefull/useless depends pretty much on the use case :)
The focus should be on an API that can be used to control/manage a set of applications the run together, usually under the same userid or with root privileges.
Comments on the diffs:
ProcessHandle.java: - The notes should probably be use @implNote (though I disagree with providing truncated information)
Didn't knew that one but I'll use it. Thanks for the hint.
ProcessHandleImpl_unix: - I would not have expected to find the Solaris and AIX implementations of getStatInfo and getCmdlineInfo here; but perhaps it depends on how one considers the lineage of the real unix. Since the more common platforms are Linux, Redhat, etc I would keep those versions in the ProcessHandleImpl_unix common file.
As I wrote in the summary - I don't think the _unix file should contain the implementations of the most prominent Unix platform but rather the implementations shared between most platforms. If a function implementation is only being used on Linux I think it is reasonable to put it into a _linux file. A while ago we discussed if the 'unix' directory should be named 'unix' or 'posix' or whatsoever and ended up with 'unix'. I think that's a good decision because it honours the fact that there is no single, true, genuine Unix variant. The 'unix' directory is a pragmatic approach to share "common" Unix code with "common" in the sense of "supported by at least two or more Unix variants". At least that was my understanding. I'll start to merge in your new changes into my schema and post a new webrev once I'm done. Thanks once again for your comments, Volker
Thanks, Roger
I have tested all my changes on Linux/amd64, Linux/ppc64, MacOS X, Solaris and AIX.
I'd be happy to forward port my changes to the newest head version if you generally agree with my approach (and give me a short time frame where you promise not to do any massive changes :)
What do you think?
Thank you and best regards, Volker
Hi Volker, yes, press ahead. On the _unix topic, I think the most platforms depends on when/how/what you count. I was following the prior pattern that counted OS X as being mostly common with Linux and Oracle Linux as a Redhat derivative. That counts at least as 2 I think. I'd be interested in other folks views on the usefulness of partial/truncated info from the API. If the values are so weak they are only useful for logging then that's not good enough. Perhaps the truncated information should be presented from a different method from ProcessHandle.Info object to be able to reflect its poor quality. Thanks, Roger On 7/20/2015 10:46 AM, Volker Simonis wrote:
Hi Roger,
thanks for looking at the webrev. Please find my comments inline:
On Mon, Jul 20, 2015 at 4:01 PM, Roger Riggs <Roger.Riggs@oracle.com> wrote:
Hi Volker,
yes, generally, the approach is an improvement and provides a clearer model of os specific variations.
More comments inline...
On 7/17/2015 2:28 PM, Volker Simonis wrote:
Hi,
I worked on refactoring the native ProcessHandleImpl implementation to enable more code sharing and make it easier to add support for new platforms (e.g. AIX) and I think I had a pretty nice version running. But unfortunately I've just realized that the recent changes to ProcessHandleImpl (i.e. "8078099: (process) ProcessHandle should uniquely identify processes" and "8078108: (process) ProcessHandle.isAlive should be robust") have been massive so I have to start over to merge all my changes with the new version.
But before doing that I just wanted to post my current changes which cleanly apply to the repo before 8078099 and ask for your opinion:
http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/
Here's a summary (taken from ProcesHandleImpl_unix.c) of what I've actually done (if the output appears scrambled in you mail better read it in the webrev
http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/src/java.base/un... ):
/** * This file contains the implementation of the native ProcessHandleImpl * functions which are common to all Unix variants. * * The currently supported Unix variants are Solaris, Linux, MaxOS X and AIX. * The various similarities and differences between these systems make it hard * to find a clear boundary between platform specific and shared code. * * In order to ease code sharing between the pltforms while still keeping the * code as clean as possible (i.e. free of preprocessor macros) we use the * following source code layout (remember that ProcessHandleImpl_unix.c will * be compiled on EVERY Unix platform while ProcessHandleImpl_<os>.c will be * only compiled on the specific OS): * * - all the JNI wrappers for the ProcessHandleImpl functions go into this file * - if their implementation is common on ALL the supported Unix platforms it * goes right into the JNI wrappers * - if the whole function or substantial parts are platform dependent, the * implementation goes into os_<function_name> functions in * ProcessHandleImpl_<os>.c * - if at least two platforms implement an os_<function_name> function in the * same way, this implementation is factored out into unix_<function_name>, * placed into this file and called from the corresponding os_<function_name> * function. * - For convenience, all the os_ and unix_ functions are declared in * ProcessHandleImpl_unix.h which is included into every * ProcessHandleImpl_<os>.c file. * * Example 1: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_initIDs() * is the same on all platforms except on Linux where it initilizes one * additional field. So we place the implementation right into * Java_java_lang_ProcessHandleImpl_00024Info_initIDs() but add call to * os_init() at the end of the function which is empty on all platforms * except Linux where it performs the additionally initializations. * * Example 2: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_info0 is * the same on Solaris and AIX but different on Linux and MacOSX. We therefore * simply call the two helpers os_getStatInfo() and os_getCmdlineInfo(). The * Linux and MaxOS X versions of these functions (in the corresponding files * ProcessHandleImpl_linux.c and ProcessHandleImpl_macosx.c) directly contain * the platform specific implementations while the Solaris and AIX * implementations simply call back to unix_getStatInfo() and * unix_getCmdlineInfo() which are implemented right in this file. * * The term "same implementation" is still a question of interpretation. It my * be acceptable to have a few ifdef'ed lines if that allows the sharing of a * huge function. On the other hand, if the platform specific code in a shared * function grows over a certain limit, it may be better to refactor that * functionality into corresponding, platform-specific os_ functions. */
This resulted in the new file ProcessHandleImpl_linux.c which now only contains Linux-only code which was previously in ifdefed for Linux in ProcessHandleImpl_linux.c. The advantage is that we now have only one version of:
Java_java_lang_ProcessHandleImpl_00024Info_initIDs unix_fillArgArray unix_uidToUser
and on version of
unix_getChildren
which is shared by Solaris, Linux and AIX and one version of
unix_getStatInfo unix_getCmdlineInfo
which are shared by Solaris and Linux.
Additionally I've added the AIX port following the new schema and I've slightly improved the Solaris port to use the information from psinfo.pr_psargs for reporting at least the first 80 characters of the command line arguments. arg[0] from psinfo.pr_psargs is also used as the "command()" string in the case where "/proc/%d/path/a.out" isn't readable. This helps to report a command string for all the processes we do not own, because in that case, "/proc/%d/path/a.out" isn't readable. By the way, using args[0] as a fall-back for cmd beeing null also helps on MacOS X for processes we don't own. Finally I've also done some test improvements. My bias was toward omitting information that was incomplete or not usable programatically. If the commandline is truncated then it is pretty useless to a program, the same with the executable command. If the application has to try to guess what's missing and how to use what is present then it should be omitted from the API. If there is a user present then they can use 'ps' or 'tasklist' and interpret the partial information.
But 'ps' is limited to 80 characters for arguments and command line as well on Solaris as far as I know. Moreover, on AIX I haven't found something better than this for the command line at all. Anyway we have to live with the fact that this information is extremely platform dependent. The least common divisor is surely the PID but for this API I prefer to expose "as much as possible" for a given platform.
I suspect that once this API is out there in the world, people will use it for all kind of stuff like monitoring or logging and I'd rather prefer to see a truncated command line in a log file instad of just seeing a plain PID. So usefull/useless depends pretty much on the use case :)
The focus should be on an API that can be used to control/manage a set of applications the run together, usually under the same userid or with root privileges.
Comments on the diffs:
ProcessHandle.java: - The notes should probably be use @implNote (though I disagree with providing truncated information)
Didn't knew that one but I'll use it. Thanks for the hint.
ProcessHandleImpl_unix: - I would not have expected to find the Solaris and AIX implementations of getStatInfo and getCmdlineInfo here; but perhaps it depends on how one considers the lineage of the real unix. Since the more common platforms are Linux, Redhat, etc I would keep those versions in the ProcessHandleImpl_unix common file.
As I wrote in the summary - I don't think the _unix file should contain the implementations of the most prominent Unix platform but rather the implementations shared between most platforms. If a function implementation is only being used on Linux I think it is reasonable to put it into a _linux file.
A while ago we discussed if the 'unix' directory should be named 'unix' or 'posix' or whatsoever and ended up with 'unix'. I think that's a good decision because it honours the fact that there is no single, true, genuine Unix variant. The 'unix' directory is a pragmatic approach to share "common" Unix code with "common" in the sense of "supported by at least two or more Unix variants". At least that was my understanding.
I'll start to merge in your new changes into my schema and post a new webrev once I'm done.
Thanks once again for your comments, Volker
Thanks, Roger
I have tested all my changes on Linux/amd64, Linux/ppc64, MacOS X, Solaris and AIX.
I'd be happy to forward port my changes to the newest head version if you generally agree with my approach (and give me a short time frame where you promise not to do any massive changes :)
What do you think?
Thank you and best regards, Volker
Hi, so here comes the new webrev which cleanly applies to the current jdk9-dev/jdk repo: http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v2/ The good thing beforehand - although I added the AIX-port and a big 50-line comment the new version is still 60 lines shorter :) 10 files changed, 829 insertions(+), 999 deletions(-) The main idea behind the new code layout is as follows (see the comment in ProcessHandleImpl_unix.c for more details): The currently supported Unix variants are Solaris, Linux, MaxOS X and AIX. The various similarities and differences between these systems make it hard to find a clear boundary between platform specific and shared code. In order to ease code sharing between the platforms while still keeping the code as clean as possible (i.e. free of preprocessor macros) we use the following source code layout (remember that ProcessHandleImpl_unix.c will be compiled on EVERY Unix platform while ProcessHandleImpl_<os>.c will be only compiled on the specific OS): - all the JNI wrappers for the ProcessHandleImpl functions go into ProcessHandleImpl_unix.c file - if their implementation is common on ALL the supported Unix platforms it goes right into the JNI wrappers - if the whole function or substantial parts of it are platform dependent, the implementation goes into os_<function_name> functions in ProcessHandleImpl_<os>.c - if at least two platforms implement an os_<function_name> function in the same way, this implementation is factored out into unix_<function_name>, placed into ProcessHandleImpl_unix.c and called from the corresponding os_<function_name> function. - For convenience, all the os_ and unix_ functions are declared in ProcessHandleImpl_unix.h which is included into every ProcessHandleImpl_<os>.c file. So ProcessHandleImpl_unix.c only contains code which is shared by at least two platforms. Code which is specific to one single platform goes into the corresponding ProcessHandleImpl_<os>.c file (and this changes adds ProcessHandleImpl_aix.c for AIX-specific code and ProcessHandleImpl_linux.c for Linux-specific code which was in ProcessHandleImpl_unix.c previously). I tried to unify similar functions (e.g. Java_java_lang_ProcessHandleImpl_00024Info_initIDs, Java_java_lang_ProcessHandleImpl_isAlive0 or fillArgArray) into a single instance in ProcessHandleImpl_unix.c. I renamed and unified getStatInfo() into os_getParentPid()/unix_getParentPid() with three implemantions for MacOSX, Linux and Solaris/AIX. I've factored out Java_java_lang_ProcessHandleImpl_getProcessPids0() into os_getChildren()/unix_getChildren() with two implemantions - one for MacOSX and one for Linux/Solaris/AIX. I've tested the new implementation on MacOSX, Linux (x86_64 and ppc64) , Solaris 10/11 (x86_64 and Sparc) and AIX. The final point, which was already mentioned in the first review is the fact that we can not easily get accurate information fort he command line and arguments on AIX. I'd prefer to stay with the current solution which gives "best effort" answers for these attributes but if everybody’s opinion is that such information is useless I can also omit it altogether. I haven’t looked at the other platforms which we have to support (e.g. HPUX, AS/400) but the situation may be similar there. Regards, Volker On Mon, Jul 20, 2015 at 4:56 PM, Roger Riggs <Roger.Riggs@oracle.com> wrote:
Hi Volker,
yes, press ahead.
On the _unix topic, I think the most platforms depends on when/how/what you count. I was following the prior pattern that counted OS X as being mostly common with Linux and Oracle Linux as a Redhat derivative. That counts at least as 2 I think.
I'd be interested in other folks views on the usefulness of partial/truncated info from the API. If the values are so weak they are only useful for logging then that's not good enough.
Perhaps the truncated information should be presented from a different method from ProcessHandle.Info object to be able to reflect its poor quality.
Thanks, Roger
On 7/20/2015 10:46 AM, Volker Simonis wrote:
Hi Roger,
thanks for looking at the webrev. Please find my comments inline:
On Mon, Jul 20, 2015 at 4:01 PM, Roger Riggs <Roger.Riggs@oracle.com> wrote:
Hi Volker,
yes, generally, the approach is an improvement and provides a clearer model of os specific variations.
More comments inline...
On 7/17/2015 2:28 PM, Volker Simonis wrote:
Hi,
I worked on refactoring the native ProcessHandleImpl implementation to enable more code sharing and make it easier to add support for new platforms (e.g. AIX) and I think I had a pretty nice version running. But unfortunately I've just realized that the recent changes to ProcessHandleImpl (i.e. "8078099: (process) ProcessHandle should uniquely identify processes" and "8078108: (process) ProcessHandle.isAlive should be robust") have been massive so I have to start over to merge all my changes with the new version.
But before doing that I just wanted to post my current changes which cleanly apply to the repo before 8078099 and ask for your opinion:
http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/
Here's a summary (taken from ProcesHandleImpl_unix.c) of what I've actually done (if the output appears scrambled in you mail better read it in the webrev
http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/src/java.base/un... ):
/** * This file contains the implementation of the native ProcessHandleImpl * functions which are common to all Unix variants. * * The currently supported Unix variants are Solaris, Linux, MaxOS X and AIX. * The various similarities and differences between these systems make it hard * to find a clear boundary between platform specific and shared code. * * In order to ease code sharing between the pltforms while still keeping the * code as clean as possible (i.e. free of preprocessor macros) we use the * following source code layout (remember that ProcessHandleImpl_unix.c will * be compiled on EVERY Unix platform while ProcessHandleImpl_<os>.c will be * only compiled on the specific OS): * * - all the JNI wrappers for the ProcessHandleImpl functions go into this file * - if their implementation is common on ALL the supported Unix platforms it * goes right into the JNI wrappers * - if the whole function or substantial parts are platform dependent, the * implementation goes into os_<function_name> functions in * ProcessHandleImpl_<os>.c * - if at least two platforms implement an os_<function_name> function in the * same way, this implementation is factored out into unix_<function_name>, * placed into this file and called from the corresponding os_<function_name> * function. * - For convenience, all the os_ and unix_ functions are declared in * ProcessHandleImpl_unix.h which is included into every * ProcessHandleImpl_<os>.c file. * * Example 1: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_initIDs() * is the same on all platforms except on Linux where it initilizes one * additional field. So we place the implementation right into * Java_java_lang_ProcessHandleImpl_00024Info_initIDs() but add call to * os_init() at the end of the function which is empty on all platforms * except Linux where it performs the additionally initializations. * * Example 2: * ---------- * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_info0 is * the same on Solaris and AIX but different on Linux and MacOSX. We therefore * simply call the two helpers os_getStatInfo() and os_getCmdlineInfo(). The * Linux and MaxOS X versions of these functions (in the corresponding files * ProcessHandleImpl_linux.c and ProcessHandleImpl_macosx.c) directly contain * the platform specific implementations while the Solaris and AIX * implementations simply call back to unix_getStatInfo() and * unix_getCmdlineInfo() which are implemented right in this file. * * The term "same implementation" is still a question of interpretation. It my * be acceptable to have a few ifdef'ed lines if that allows the sharing of a * huge function. On the other hand, if the platform specific code in a shared * function grows over a certain limit, it may be better to refactor that * functionality into corresponding, platform-specific os_ functions. */
This resulted in the new file ProcessHandleImpl_linux.c which now only contains Linux-only code which was previously in ifdefed for Linux in ProcessHandleImpl_linux.c. The advantage is that we now have only one version of:
Java_java_lang_ProcessHandleImpl_00024Info_initIDs unix_fillArgArray unix_uidToUser
and on version of
unix_getChildren
which is shared by Solaris, Linux and AIX and one version of
unix_getStatInfo unix_getCmdlineInfo
which are shared by Solaris and Linux.
Additionally I've added the AIX port following the new schema and I've slightly improved the Solaris port to use the information from psinfo.pr_psargs for reporting at least the first 80 characters of the command line arguments. arg[0] from psinfo.pr_psargs is also used as the "command()" string in the case where "/proc/%d/path/a.out" isn't readable. This helps to report a command string for all the processes we do not own, because in that case, "/proc/%d/path/a.out" isn't readable. By the way, using args[0] as a fall-back for cmd beeing null also helps on MacOS X for processes we don't own. Finally I've also done some test improvements.
My bias was toward omitting information that was incomplete or not usable programatically. If the commandline is truncated then it is pretty useless to a program, the same with the executable command. If the application has to try to guess what's missing and how to use what is present then it should be omitted from the API. If there is a user present then they can use 'ps' or 'tasklist' and interpret the partial information.
But 'ps' is limited to 80 characters for arguments and command line as well on Solaris as far as I know. Moreover, on AIX I haven't found something better than this for the command line at all. Anyway we have to live with the fact that this information is extremely platform dependent. The least common divisor is surely the PID but for this API I prefer to expose "as much as possible" for a given platform.
I suspect that once this API is out there in the world, people will use it for all kind of stuff like monitoring or logging and I'd rather prefer to see a truncated command line in a log file instad of just seeing a plain PID. So usefull/useless depends pretty much on the use case :)
The focus should be on an API that can be used to control/manage a set of applications the run together, usually under the same userid or with root privileges.
Comments on the diffs:
ProcessHandle.java: - The notes should probably be use @implNote (though I disagree with providing truncated information)
Didn't knew that one but I'll use it. Thanks for the hint.
ProcessHandleImpl_unix: - I would not have expected to find the Solaris and AIX implementations of getStatInfo and getCmdlineInfo here; but perhaps it depends on how one considers the lineage of the real unix. Since the more common platforms are Linux, Redhat, etc I would keep those versions in the ProcessHandleImpl_unix common file.
As I wrote in the summary - I don't think the _unix file should contain the implementations of the most prominent Unix platform but rather the implementations shared between most platforms. If a function implementation is only being used on Linux I think it is reasonable to put it into a _linux file.
A while ago we discussed if the 'unix' directory should be named 'unix' or 'posix' or whatsoever and ended up with 'unix'. I think that's a good decision because it honours the fact that there is no single, true, genuine Unix variant. The 'unix' directory is a pragmatic approach to share "common" Unix code with "common" in the sense of "supported by at least two or more Unix variants". At least that was my understanding.
I'll start to merge in your new changes into my schema and post a new webrev once I'm done.
Thanks once again for your comments, Volker
Thanks, Roger
I have tested all my changes on Linux/amd64, Linux/ppc64, MacOS X, Solaris and AIX.
I'd be happy to forward port my changes to the newest head version if you generally agree with my approach (and give me a short time frame where you promise not to do any massive changes :)
What do you think?
Thank you and best regards, Volker
participants (2)
-
Roger Riggs
-
Volker Simonis