RFR 9: 8077350 Process API Updates Implementation Review

Thomas Stüfe thomas.stuefe at gmail.com
Sat Apr 18 18:44:49 UTC 2015


Hi Roger,

On Fri, Apr 17, 2015 at 8:57 PM, Roger Riggs <Roger.Riggs at oracle.com> wrote:

> Hi David,
>
> On 4/17/2015 2:44 PM, David M. Lloyd wrote:
>
>> On 04/17/2015 11:53 AM, Roger Riggs wrote:
>>
>>> Hi Peter,
>>>
>>> On 4/17/2015 4:05 AM, Peter Levart wrote:
>>>
>>>> Hi Roger,
>>>>
>>>> Retrieving and caching the process start time as soon as ProcessHandle
>>>> is instantiated might be a good idea. "destroy" native code would then
>>>> use pid *and* start time to check the identity of the process before
>>>> killing it.
>>>>
>>> Yes, though it does raise a question about how to specify PH.equals().
>>> Either the start time is explicitly
>>> mentioned (and is repeatable and testable) or it is vague and refers to
>>> some implementation
>>> specific value that uniquely identifies the process; and is less
>>> testable.
>>>
>>
>> Even with timestamps though, you cannot prevent the PID from being reused
>> in the time window after you verify its timestamp but before you kill it
>> unless it is a direct child process (on UNIX anyway; who knows what Windows
>> does...).  I think that this inevitable race should be mentioned in the
>> docs regardless of whether the timestamp thing is implemented (or doc'd) to
>> prevent unrealistic expectations (the api draft link seems to be dead so I
>> didn't see if it already includes this kind of language).
>>
> I can add a note about race conditions to the existing class level
> javadoc about process information changing asynchronously.
>
> "Race conditions can exist between checking the status of a process and
> acting upon it."
>
> But I'm still struck by the oddity of Java needing to provide a better
> interface
> to processes than the native OS does.  In the native OS, a pid is a pid and
> the only handle given to applications.  So both the app and the os need to
> be conservative about pid re-use.
>
>
The problem is the audience. Every UNIX system programmer knows about the
dangers of pid recycling. Java programmers don't, and your API promises a
robustness and simplicity which it unfortunately cannot deliver. No one
will expect a ProcessHandle returned by allChildren suddenly to refer to a
totally different process. Therefore I also think the dangers should be
clearly mentioned in the java doc.

Regards, Thomas



> The link[1] was broken while I replaced it with an update reflecting the
> recent comments.
>
> Thanks, Roger
>
> [1] http://cr.openjdk.java.net/~rriggs/phdoc/
>
>
>
>>  At least on Linux (/proc/<pid>/stat) and Solaris (/proc/<pid>/status)
>>>> it is not necessary to open those special files and read them. Just
>>>> doing stat() on them and using the st_mtime will get you the process
>>>> start time. I see AIX shares native code with Linux (unix), so perhaps
>>>> AIX does the same. Mac OSX and Windows have special calls...
>>>>
>>> That is less expensive.  But opening /proc/<pid>/stat is necessary for
>>> allChildren to get
>>> and filter on the parent and can provide the starttime as well.
>>> stat would be useful for allProcesses which does not need the parent.
>>>
>>>>
>>>> In case OS does not allow retrieving the start time (not owner or
>>>> similar), it should be cached as some "undefined" value and treated
>>>> the same when destroying. If while obtaining the ProcessHandle *and*
>>>> while destroying the process, the start time of the process with a
>>>> particular pid is "undefined" then there are two options:
>>>>
>>>> 1 - don't kill the process
>>>> 2 - kill the process
>>>>
>>>> They might actually be no different as the reason of not being able to
>>>> retrieve the start time (not owner) might prevent the process from
>>>> being killed too, so option 2 could be used to allow killing on any
>>>> hypothetical platforms that don't support retrieving start time and it
>>>> is no worse than current implementation anyway.
>>>>
>>> Destroy is a best-effort method;  it is not guaranteed to result in
>>> termination, though
>>> the docs are a bit weak on this point.
>>>
>>> I'd go for option 2, anyone using the API is looking for results on
>>> processes they
>>> already know something about (mostly), so there's no reason to be
>>> particularly
>>> conservative about.  If the caller is trying to be more conservative,
>>> they can maintain
>>> their own extra information about the processes they are managing.
>>>
>>>
>>>
>>>> What do you think?
>>>>
>>> I'd like to pick this up as a separate change, and get the current one
>>> in first.
>>>
>>> Roger
>>>
>>>
>>>
>>
>



More information about the core-libs-dev mailing list