RFR: 8203227: Introduce os::processor_id() for Linux and Solaris

Wed May 16 12:16:04 UTC 2018

Hi Per,

On 16/05/2018 8:55 PM, Per Liden wrote:
> Hi David,
> 
> Thanks for looking at this.
> 
> On 05/16/2018 04:24 AM, David Holmes wrote:
>> Hi Per,
>>
>> On 16/05/2018 1:07 AM, Per Liden wrote:
>>> This patch introduces os::processor_id() to get the id of the CPU 
>>> that the caller is currently executing on. Obviously, the returned id 
>>> should only be viewed as a strong hint, since the information might 
>>> be instantly out-of-date.
>>
>> Or even wrong to begin with:
>>
>> https://stackoverflow.com/questions/36934736/is-sched-getcpu-reliable-on-linux 
>>
>>
>> caveat-emptor! :)
>>
>>> We're using this in ZGC to do various CPU-local operations. We currently
>>
>> So there's a requirement that the CPU value returned by processor_id() 
>> is compatible with some other API's that take a processor id? Does 
>> that need documenting in some form?
> 
> Good point, this should be made more clear. We currently assume that 
> os::processor_id() returns a value from 0 to (os::processor_count() - 
> 1). Adding a comment in os.hpp clarifying that.

Thanks. Not every value in that range may be a valid processor id, but 
every valid processor id should be within that range.

>>> only need this for Linux and Solaris, which is what this patch adds. 
>>
>> Are the API's that take the processor id defined in os as well? 
>> Otherwise it seems this functionality should just be localized to the 
>> specific os_<os>.cpp file from which it is used (and/or the os::<OS> 
>> class).
> 
> The os::processor_id() call-sites we have are in shared os-agnostic 
> code. This function is also fairly closely related to 
> os::processor_count(), so it seems like os:: would be the natural place 
> for this.

I'm curious as to what functions in os-agnostic code consume the 
processor id value? Or are you storing/caching it in the shared code and 
only using it later from OS specific code?

>>
>>> Support for additional platforms can be added when needed. I didn't 
>>> add dummy os::processor_id() to the not-yet-implemented platforms, 
>>> since I think it's better to get a early compile-time error instead 
>>> of a Unimplemented() at runtime, if you try to use them.
>>
>> I'd rather not see the runtime support of our primary platforms of 
>> interest fragmented like this. Are there technical challenges to doing 
>> this?
> 
> I'm a bit reluctant to add this for other platforms at this point, since 
> some platforms might require non-trivial implementations, which would go 
> unused/untested. For macOS I believe we would need something like this:
> 
> http://cr.openjdk.java.net/~gziemski/zgc_os_processor_id/webrev/src/hotspot/os/bsd/os_bsd.cpp.udiff.html 

Yeah direct use of cpuid is not pleasant.

> (I should add that I don't know if this patch does the right thing, but 
> Gerard apparently had some success with it)
> 
> For Windows, I believe a call to GetCurrentProcessorNumber() should be 
> enough (we don't seem to handle/support logical processor groups in 
> os::processor_count() so should be safe to do the same thing here).

The current processor should always be correct regardless of what 
logical/virtual processor configuration exists.

> The plan has been to add support for these if/when ZGC supports these 
> platforms (unless someone else needs and adds them before that). But 
> adding them now makes me a bit uneasy, especially given the non-trivial 
> nature of the macOS implementation.

Okay. I'd prefer to see all platforms covered, but not with the risk of 
untested code being used.

>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203227
>>> Webrev: http://cr.openjdk.java.net/~pliden/8203227/webrev.0
>>
>> For Solaris there is no need to introduce Solaris::getcpuid() just 
>> because Linux already has that. (Unless this ends up not being needed 
>> in the os class of course :) ).
> 
> Check.
> 
>>
>> Are you allowing for the -1 return value on Linux or assuming it will 
>> never happen in practice? If the former then that should be documented 
>> as part of the semantics of os::processor_id(). If the latter then you 
>> should probably add an assert or guarantee. I suspect this is a 
>> historical restriction no longer relevant to us - in which case we can 
>> probably clean out the dynamic lookup/linking!).
> 
> We're assuming it will never happen in practice, and this is currently 
> guarded with a guarantee() during initialization of ZGC. sched_getcpu() 
> has "only" been around since glibc 2.14 (June 2011). However, we can 
> also receive -1 if the kernel doesn't have the getcpu() syscall (first 
> introduced in Linux 2.6.19, Nov 2006).
> 
> So, it seems safe to at least assume that the getcpu() syscall is 
> available, and only have an single check for that during startup (with a 
> vm_exit_during_initialization() if it's unavailable).

Okay. We should clean up the syscall and dynamic lookup code use here.

> Here's an updated webrev for continued discussion:
> 
> http://cr.openjdk.java.net/~pliden/8203227/webrev.1

Looks fine to me.

Thanks,
David

> cheers,
> Per
> 
>>
>> Thanks,
>> David
>>
>>> /Per