RFR: 8203227: Introduce os::processor_id() for Linux and Solaris

Wed May 16 12:54:26 UTC 2018

Hi David,

On 05/16/2018 02:16 PM, David Holmes wrote:
[...]
>>>> We're using this in ZGC to do various CPU-local operations. We 
>>>> currently
>>>
>>> So there's a requirement that the CPU value returned by 
>>> processor_id() is compatible with some other API's that take a 
>>> processor id? Does that need documenting in some form?
>>
>> Good point, this should be made more clear. We currently assume that 
>> os::processor_id() returns a value from 0 to (os::processor_count() - 
>> 1). Adding a comment in os.hpp clarifying that.
> 
> Thanks. Not every value in that range may be a valid processor id, but 
> every valid processor id should be within that range.

Yep.

> 
>>>> only need this for Linux and Solaris, which is what this patch adds. 
>>>
>>> Are the API's that take the processor id defined in os as well? 
>>> Otherwise it seems this functionality should just be localized to the 
>>> specific os_<os>.cpp file from which it is used (and/or the os::<OS> 
>>> class).
>>
>> The os::processor_id() call-sites we have are in shared os-agnostic 
>> code. This function is also fairly closely related to 
>> os::processor_count(), so it seems like os:: would be the natural 
>> place for this.
> 
> I'm curious as to what functions in os-agnostic code consume the 
> processor id value? Or are you storing/caching it in the shared code and 
> only using it later from OS specific code?

The typical way we use this in ZGC is to have CPU-local data, 
conceptually equivalent to this toy example:

// Allocate per-CPU data
MyData* my_per_cpu_data = alloc(sizeof(MyData) * os::processor_count());

// Read CPU-local data
... = my_per_cpu_data[os::processor_id()];

Have a look at the ZPerCPU template class if you're interested in more 
details: 
http://hg.openjdk.java.net/zgc/zgc/file/118cb3e3517d/src/hotspot/share/gc/z/zValue.hpp

This template basically allows us to just say:

// Allocate per-CPU data
ZPerCPU<MyData> my_per_cpu_data;

// Read CPU-local data
... = my_per_cpu_data.get();

It should be noted that ZGC is only ever built (controlled by sh 
configure) for supported platforms (Linux & Solaris at this moment), 
which is why it's fine to have os::processor_id() calls in this shared code.

> 
>>>
>>>> Support for additional platforms can be added when needed. I didn't 
>>>> add dummy os::processor_id() to the not-yet-implemented platforms, 
>>>> since I think it's better to get a early compile-time error instead 
>>>> of a Unimplemented() at runtime, if you try to use them.
>>>
>>> I'd rather not see the runtime support of our primary platforms of 
>>> interest fragmented like this. Are there technical challenges to 
>>> doing this?
>>
>> I'm a bit reluctant to add this for other platforms at this point, 
>> since some platforms might require non-trivial implementations, which 
>> would go unused/untested. For macOS I believe we would need something 
>> like this:
>>
>> http://cr.openjdk.java.net/~gziemski/zgc_os_processor_id/webrev/src/hotspot/os/bsd/os_bsd.cpp.udiff.html 
> 
> 
> Yeah direct use of cpuid is not pleasant.
> 
>> (I should add that I don't know if this patch does the right thing, 
>> but Gerard apparently had some success with it)
>>
>> For Windows, I believe a call to GetCurrentProcessorNumber() should be 
>> enough (we don't seem to handle/support logical processor groups in 
>> os::processor_count() so should be safe to do the same thing here).
> 
> The current processor should always be correct regardless of what 
> logical/virtual processor configuration exists.

Check.

> 
>> The plan has been to add support for these if/when ZGC supports these 
>> platforms (unless someone else needs and adds them before that). But 
>> adding them now makes me a bit uneasy, especially given the 
>> non-trivial nature of the macOS implementation.
> 
> Okay. I'd prefer to see all platforms covered, but not with the risk of 
> untested code being used.
> 
>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203227
>>>> Webrev: http://cr.openjdk.java.net/~pliden/8203227/webrev.0
>>>
>>> For Solaris there is no need to introduce Solaris::getcpuid() just 
>>> because Linux already has that. (Unless this ends up not being needed 
>>> in the os class of course :) ).
>>
>> Check.
>>
>>>
>>> Are you allowing for the -1 return value on Linux or assuming it will 
>>> never happen in practice? If the former then that should be 
>>> documented as part of the semantics of os::processor_id(). If the 
>>> latter then you should probably add an assert or guarantee. I suspect 
>>> this is a historical restriction no longer relevant to us - in which 
>>> case we can probably clean out the dynamic lookup/linking!).
>>
>> We're assuming it will never happen in practice, and this is currently 
>> guarded with a guarantee() during initialization of ZGC. 
>> sched_getcpu() has "only" been around since glibc 2.14 (June 2011). 
>> However, we can also receive -1 if the kernel doesn't have the 
>> getcpu() syscall (first introduced in Linux 2.6.19, Nov 2006).
>>
>> So, it seems safe to at least assume that the getcpu() syscall is 
>> available, and only have an single check for that during startup (with 
>> a vm_exit_during_initialization() if it's unavailable).
> 
> Okay. We should clean up the syscall and dynamic lookup code use here.
> 
>> Here's an updated webrev for continued discussion:
>>
>> http://cr.openjdk.java.net/~pliden/8203227/webrev.1
> 
> Looks fine to me.

Ok, thanks David!

/Per

> 
> Thanks,
> David
> 
>> cheers,
>> Per
>>
>>>
>>> Thanks,
>>> David
>>>
>>>> /Per