RFR 9: 8078099 : (process) ProcessHandle should uniquely identify processes

Tue Jul 7 13:11:00 UTC 2015

Hi Volker,

Sure, let me know what is different about AIX and how it should/can be 
fixed.

Thanks, Roger

On 7/7/2015 3:29 AM, Volker Simonis wrote:
> Hi Roger,
>
> sorry but I only now noticed this RFR. Can you please temporary hold
> this change back. I think it needs some adjustments on AIX. I'll look
> into it today and let you know.
>
> Thanks,
> Volker
>
>
>
> On Mon, Jul 6, 2015 at 4:47 PM, Roger Riggs <Roger.Riggs at oracle.com> wrote:
>> Hi Peter,
>>
>> Thanks for reviewing the new implementation.
>>
>> The idea to pre-allocate the buffer based on the size of the output arrays
>> passed for output
>> will not work as desired.
>> The same native code entry point is used whether it is a query for all
>> processes or
>> just the immediate children.  In either case, the buffer for sysctl needs to
>> be
>> large enough to accommodate all of the OS threads; its not a function of the
>> output array size.
>>
>> With respect to the buffer allocation on OS X, I might see the point of a
>> retry
>> when sysctl returns ENOMEM;
>>
>>
>> I don't think its particularly worthwhile to add the native code to retry in
>> the case
>> of ENOMEM. The OSX doc for sysctrl does indicate it tries to round up
>> to avoid a subsequent failure.
>> The ProcessHandle API already cautions that the list of processes is dynamic
>> and that processes may started or terminate concurrently with the call  to
>> children/allChildren.
>> So if the sysctl returns ENOMEM indicating not all the processes fit; the
>> ones
>> that do fit within in the allocated buffer are valid at that point in time.
>> If there are extras that do not fit in the buffer they are likely to be
>> 'newer'
>> processes and can be considered to have been started 'after' the
>> children/allChildren invocation.
>> So I propose to change the error checking to consider ENOMEM as a
>> non-exceptional
>> return and return the processes available.
>>
>> Thanks, Roger
>>
>>
>>
>>
>> On 7/2/2015 3:32 AM, Peter Levart wrote:
>>> Hi Roger,
>>>
>>> I looked at the code briefly and have the following comments:
>>>
>>> For ProcessHandleImpl_macosx.c, in method getProcessPids0:
>>>
>>>   224     // Get buffer size needed to read all processes
>>>   225     int mib[4] = {CTL_KERN, KERN_PROC, KERN_PROC_ALL, 0};
>>>   226     if (sysctl(mib, 4, NULL, &bufSize, NULL, 0) < 0) {
>>>   227         JNU_ThrowByNameWithLastError(env,
>>>   228             "java/lang/RuntimeException", "sysctl failed");
>>>   229         return -1;
>>>   230     }
>>>   231
>>>   232     // Allocate buffer big enough for all processes
>>>   233     void *buffer = malloc(bufSize);
>>>   234     if (buffer == NULL) {
>>>   235         JNU_ThrowOutOfMemoryError(env, "malloc failed");
>>>   236         return -1;
>>>   237     }
>>>   238
>>>   239     // Read process info for all processes
>>>   240     if (sysctl(mib, 4, buffer, &bufSize, NULL, 0) < 0) {
>>>   241         JNU_ThrowByNameWithLastError(env,
>>>   242             "java/lang/RuntimeException", "sysctl failed");
>>>   243         free(buffer);
>>>   244         return -1;
>>>   245     }
>>>
>>> ... the 1st call to sysctl is used to measure the size of the needed
>>> buffer and the 2nd call fills-in the buffer. The documentation for sysctl
>>> says:
>>>
>>> "     The information is copied into the buffer specified by oldp. The
>>> size of the buffer is given by the
>>>       location specified by oldlenp before the call, and that location
>>> gives the amount of data copied after
>>>       a successful call and after a call that returns with the error code
>>> ENOMEM.  If the amount of data
>>>       available is greater than the size of the buffer supplied, the call
>>> supplies as much data as fits in
>>>       the buffer provided and returns with the error code ENOMEM. If the
>>> old value is not desired, oldp and
>>>       oldlenp should be set to NULL.
>>>
>>>       The size of the available data can be determined by calling sysctl()
>>> with the NULL argument for oldp.
>>>       The size of the available data will be returned in the location
>>> pointed to by oldlenp.  For some opera-tions, operations,
>>>       tions, the amount of space may change often.  For these operations,
>>> the system attempts to round up so
>>>       that the returned size is large enough for a call to return the data
>>> shortly thereafter."
>>>
>>> So while not very probable, it can happen that you get ENOMEM from 2nd
>>> call because of underestimated buffer size. Would it be better to retry
>>> (re)allocation of buffer and 2nd call in a loop with new estimation returned
>>> from previous call while the error is ENOMEM?
>>>
>>> Another suggestion: What would it be if the buffer size estimation was not
>>> computed by a call to sysctl with NULL buffer, but was taken from the
>>> passed-in resulting array size(s). In case the user passes-in arrays of
>>> sufficient size, you can avoid double invocation of sysctl. Also, if ENOMEM
>>> happens, you can just return the result obtained and the new estimation - no
>>> looping in native code. The UNIX, Solaris and Windows variants look good.
>>>
>>> Regards, Peter
>>>
>>> On 06/22/2015 05:10 PM, Roger Riggs wrote:
>>>> Please review changes to ProcessHandle implementation to uniquely
>>>> identify
>>>> processes based on the start time provided by the OS. It addresses the
>>>> issue
>>>> of PID reuse.
>>>>
>>>> This is the implementation of the ProcessHandle.equals() spec change in
>>>>      8129344 : (process) ProcessHandle instances should define equals and
>>>> be value-based
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~rriggs/webrev-starttime-8078099/
>>>>
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8078099
>>>>
>>>> Thanks, Roger
>>>>