RFR: 6515172: Runtime.availableProcessors() ignores Linux taskset command
Vitaly Davidovich
vitalyd at gmail.com
Fri Jan 22 12:58:15 UTC 2016
On Friday, January 22, 2016, David Holmes <david.holmes at oracle.com> wrote:
> On 22/01/2016 10:38 PM, Vitaly Davidovich wrote:
>
>> I don't think current thread represents current process, it represents
>> the current kernel thread :). You can launch Java with taskset and then
>> change one of the thread's affinity masks at any time via
>> sched_setaffinity, including increasing number of usable CPUs (if
>> nothing else constrains it). If you want process mask you need to pass
>> it the pid, but that may be meaningless if other threads were later
>> affinitized differently.
>>
>
> I'm not trying to account for separate threads being individually modified
> via sched_setaffinity. If you use native code to do that then you are on
> your own. There is no way for an API like Runtime.availableProcessors to
> behave sensibly in that scenario.
Right, agreed.
>
> If you taskset a running process then it is supposed to constrain all
> threads of the process.
>
> But I'm not even trying to address the situation where dynamic changes
> occur (though I did expect it to work). The main use case to address is
> starting the JVM within a taskset/cgroup and have it report the correct
> number of available processors.
Ok. I agree that 0 vs pid shouldn't matter there, as all threads will
inherit the mask. But I don't see the harm in passing getpid() to be more
explicit; it costs an extra syscall, but this shouldn't be a hot function
in normal applications.
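
To make the 0 vs getpid() distinction concrete, here is a minimal sketch
(not the code from the webrev). The helper name affinity_cpu_count is
hypothetical, and it assumes a fixed-size cpu_set_t, so it does not cover
the very large systems the patch has to handle. Passing 0 queries the
calling thread; passing getpid() queries the thread whose tid equals the
pid, i.e. the main thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes sched_getaffinity and the CPU_* macros in glibc */
#endif
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: number of CPUs in the affinity mask of `pid`,
 * or -1 if sched_getaffinity fails. pid == 0 means the calling thread.
 * Assumption: the machine has no more than CPU_SETSIZE CPUs. */
int affinity_cpu_count(pid_t pid) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(pid, sizeof(mask), &mask) != 0) {
        return -1;
    }
    return CPU_COUNT(&mask);
}
```

Called from the main thread, affinity_cpu_count(0) and
affinity_cpu_count(getpid()) look at the same thread. From any other
thread the two can differ if affinities were changed per thread.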
>
>> Also are we sure it's a good idea to change this API behavior?
>>
>
> Most people expect it to return the actual number of processors available
> to the JVM - which in normal execution it does. But once tasksets/cgroups
> come into play it fails to do that. The Solaris version was changed years
> ago to address the same issue with processor sets - linux wasn't modified
> then because the mechanism and APIs on linux hadn't stabilized and no-one
> was asking for it.
Yes, but I think people using this API already know about this limitation
and may be relying on it returning the number of online CPUs. I didn't know
Solaris takes processor sets into account. At any rate, I think your
proposal makes sense, but I'd worry it may break someone. A new method
would be safer, and then you could query both bits of info from JDK code:
the number of online CPUs and the processor affinity mask.
$.02
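
For reference, the two bits of info look roughly like this at the C level.
This is only a sketch under assumptions: the struct and function names are
hypothetical, it says nothing about what a JDK method would look like, and
it again uses a fixed-size cpu_set_t.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes sched_getaffinity and the CPU_* macros in glibc */
#endif
#include <sched.h>
#include <unistd.h>

/* Hypothetical container for the three counts discussed in this thread. */
struct cpu_counts {
    long configured;   /* sysconf(_SC_NPROCESSORS_CONF) */
    long online;       /* sysconf(_SC_NPROCESSORS_ONLN) */
    int  available;    /* CPUs in the calling thread's affinity mask */
};

/* Fills `out`; returns 0 on success, -1 if the affinity query fails
 * (in which case out->available is set to -1). */
int query_cpu_counts(struct cpu_counts *out) {
    cpu_set_t mask;

    out->configured = sysconf(_SC_NPROCESSORS_CONF);
    out->online     = sysconf(_SC_NPROCESSORS_ONLN);

    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        out->available = -1;
        return -1;
    }
    out->available = CPU_COUNT(&mask);
    return 0;
}
```

Run under something like taskset -c 0, I would expect available to drop
to 1 while online stays at the machine's count.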
>
> Thanks,
> David
>
> On Friday, January 22, 2016, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Thomas,
>>
>> On 22/01/2016 7:29 PM, Thomas Stüfe wrote:
>>
>>
>> On Fri, Jan 22, 2016 at 10:21 AM, Thomas Stüfe
>> <thomas.stuefe at gmail.com> wrote:
>>
>> Hi David,
>>
>> I may be doing this wrong, but I do not get this to work for me
>> (ubuntu 14.4).
>>
>> I built hs-rt with your patch. I am running a simple loop with
>> Runtime.availableProcessors(). I change affinity with taskset and
>> would expect output to change, but nothing changes.
>>
>> java command line: ../images/jdk/bin/java -Xlog:os=trace test
>> taskset command: taskset -p 0x1 <pid>
>>
>> Output:
>>
>> proc: 8
>> [73,525s][trace ][os] active_processor_count: using static path - configured processors: 8
>> [73,525s][trace ][os] active_processor_count: sched_getaffinity processor count: 8
>> proc: 8
>> ...
>>
>> Kind Regards, Thomas
>>
>>
>>
>> I found that if I change:
>>
>> if (sched_getaffinity(0, cpus_size, cpus_p) == 0) {
>>
>> to
>>
>> if (sched_getaffinity(getpid(), cpus_size, cpus_p) == 0) {
>>
>> it works.
>>
>> The manpage for sched_getaffinity() states that "If pid is zero, then
>> the mask of the calling process is returned." but it seems that does
>> not work.
>>
>>
>> Later manpages changed that to be "zero means the current thread" -
>> see the online manpage
>>
>> http://man7.org/linux/man-pages/man2/sched_getaffinity.2.html
>>
>> sched_getaffinity is really a thread API that takes a pid_t, which
>> is confusing to say the least. Regardless, if you change the affinity
>> of the process using taskset then it should affect all threads in
>> the process, so this seems to be a glibc and/or kernel bug :(
>>
>> I will see if I can get a C program that shows which threads in a
>> process see the updated mask. This is yet another complexity to
>> something that should be very simple, that I could really do without!
>>
>> Many thanks for the report.
>>
>> David
>> -----
>>
>> Kind Regards, Thomas
>>
>> ps. I am happy you introduced the "os" log tag, now I can use this for
>> AIX-specific logging without having to change shared files :-)
>>
>>
>>
>>
>> On Fri, Jan 22, 2016 at 9:06 AM, David Holmes
>> <david.holmes at oracle.com> wrote:
>>
>> First a special thanks to Martin Buchholz for his input, feedback,
>> critique and raising awareness of how non-simple this issue is.
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-6515172
>> webrev: http://cr.openjdk.java.net/~dholmes/6515172/webrev/
>>
>> Basic problem:
>> processors available for use <= processors online <=
>> processors configured
>>
>> but we always returned the number of online processors.
>>
>> Solution is simple in its basic form: use sched_getaffinity to get
>> the scheduling affinity mask and count the number of available
>> processors.
>>
>> Details are complicated by the desire to handle very large processor
>> systems. See the bug report for lots of detailed discussions and
>> references.
>>
>> Testing:
>> - new test that verifies behaviour when running under taskset
>> - diagnostic hook injection (UseNewCodeN) to enable testing of all
>>   code paths (one hook is left in for non-product to allow easy
>>   testing of the dynamic path)
>> - JPRT
>>
>> Compatibility issues:
>> - the system code we're using now is at least 5 years old so distros
>>   older than that (which are not officially supported) may not work
>> - anyone already running under a processor constrained environment
>>   (like Docker) and using availableProcessors() to "size" things,
>>   will find that size has now changed. We do not expect this to be a
>>   problem - on the contrary we expect Docker users to want the new
>>   behaviour.
>>
>> Thanks,
>> David
>>
>>
>>
>>
>>
>> --
>> Sent from my phone
>>
>
--
Sent from my phone