RFR: 8224816: Provide os::processor_id() implementation for Mac OS

Wed May 29 13:25:44 UTC 2019

Hi Gerard,

On 2019-05-28 19:57, gerard ziemski wrote:
> hi Erik,
>
> Thanks for doing this, it's more complicated than I had imagined it.
>
> I think we can remove line 3286, as level_max is not used anywhere (no 
> need for a new webrev):
>
> 3286       uint level_max = ebx;

Good catch, thanks.

> Question: is there an advantage to using:
> asm ("cpuid\n\t" : "+a" (eax), "+b" (ebx), "+c" (ecx), "+d" (edx) : );
> over:
>
> __cpuid(1, eax, ebx, ecx, edx);
>
> or is this just a personal preference?

I didn't think the __cpuid function really abstracts away anything from 
the inline assembly. If anything it confused me for two reasons:
1) __cpuid() allows setting the "leaf" parameter, which sets rax to that 
value... makes sense. But I need to set rcx also. Apparently then you 
could use __cpuid_count() which has an additional "count" parameter that 
seems to be placed in rcx, but it's kind of not obvious as a reader that 
is what will happen... I only know that through introspection, and 
making it explicit with inline assembly removes any question marks there.
2) The __cpuid() fascilities on x86_64 contain some code to preserve rbx 
(which is callee saved) across the cpuid call. But both GCC and clang 
inline assembly automatically preserve explicitly clobbered callee saved 
registers across inline assembly blocks (have verified that)... anything 
else would be a bug in the register allocator for inline assembly blocks.

So in the end I guess I found myself with many questions regarding the 
use of __cpuid() and __cpuid_count(), and it seemed more straight 
forward to just stick in cpuid with inline assembly. It also makes the 
code directly portable to other platforms with awkward OS support for 
cpuid... which I have heard might affect other platforms as well.

Thanks for the review.

/Erik

> cheers
>
>
>
> On 5/28/19 3:34 AM, Erik Österlund wrote:
>> Hi Per,
>>
>> Thanks for reviewing.
>>
>> On 2019-05-28 09:46, Per Liden wrote:
>>> Hi Erik,
>>>
>>> On 5/27/19 12:05 PM, Erik Österlund wrote:
>>>> Hi,
>>>>
>>>> In order to port ZGC to Mac OS, we need to implement 
>>>> os::processor_id(), due to the use of CPU locals in ZGC.
>>>> I built an implementation that uses CPUID to get the APIC id, and 
>>>> from that determine a unique CPU id.
>>>> Note though that the APIC ids are split over a potentially larger 
>>>> number space than the number of processors.
>>>> In order to reliably translate APIC ids to CPU ids, I first use 
>>>> CPUID to query how many bits are being used
>>>> for APIC ids, and then create an array of integers covering that 
>>>> space. The array maps possible APIC id positions
>>>> to a unique processor id. The array is lazily populated with APIC 
>>>> id -> processor id mappings.
>>>> The specific CPUID leaf used to query this topology information is 
>>>> intel-specific, but that should be okay as all macs are Intel based.
>>>> I have tried this out on a number of macs with a mac port of ZGC 
>>>> (to be upstreamed soon), and it worked well.
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8224816
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~eosterlund/8224816/webrev.00/
>>>
>>> Just a few comments:
>>>
>>> * Maybe replace "cpu" with "processor" in variable names, since the 
>>> function is named processor_id().
>>
>> Fixed.
>>
>>> * Could we split out the setup of the array into a separate function 
>>> helper, like:
>>>
>>>   volatile int* mapping = get_apic_to_processor_mapping();
>>
>> Fixed.
>>
>>> * Move the declaration of "apic_to_cpu_mapping" and "next_cpu_id" 
>>> out of the function, to avoid the pthread_once initialization checks 
>>> every call?
>>
>> Fixed.
>>
>> New webrev:
>> http://cr.openjdk.java.net/~eosterlund/8224816/webrev.01/
>>
>> Incremental:
>> http://cr.openjdk.java.net/~eosterlund/8224816/webrev.00_01/
>