RFR (S/M) expose L1_data_cache_line_size for diagnostic/sanity checks (8049717)
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Jul 11 17:48:03 UTC 2014
Yes, I got 32 from prtpicl on T4.
I googled: T1 had the S1 core, T2 and T3 had the S2 core, T4, T5 and T6
have the S3 core, and T7 will have the S4 core.
Based on your prtpicl data I would assume T3 has a 16-byte cache line
too, like T2.
But it is a mess. How come we don't have SPARC documents which clearly
state all these parameters :(
I don't want to start on CPUID again.
How critical is it for you to have the correct size?
Vladimir
On 7/11/14 9:09 AM, Daniel D. Daugherty wrote:
> Vladimir, thanks for the thorough review.
>
>
> On 7/10/14 1:07 PM, Vladimir Kozlov wrote:
>> Hi Dan
>>
>> vm_version_sparc.cpp:
>>
>> I don't know where you got the 16-byte cache line size:
>
> That would be covered by the comments:
>
>   if (is_sun4v()) {
>     assert(_L1_data_cache_line_size == 0, "overlap with sun4v family");
>     // All Niagara's are sun4v's, but not all sun4v's are Niagaras.
>     //
>     // Ref: UltraSPARC T1 Supplement to the UltraSPARC Architecture 2005
>     //      Appendix F.1.3.1 Cacheable Accesses
>     //
>     // Ref: UltraSPARC T2: A Highly-Threaded, Power-Efficient, SPARC SOC
>     //      Section III: SPARC Processor Core
>     //
>     // Ref: Oracle's SPARC T4-1, SPARC T4-2, SPARC T4-4, and SPARC T4-1B
>     //      Server Architecture
>     //      Section SPARC T4 Processor Cache Architecture
>     _L1_data_cache_line_size = 16;
>   }
>
> Unfortunately, I can no longer find the T4 L1 cache line size in
> that last reference. Either I dreamed it or that doc has been
> tweaked since I previously looked at it. I googled around again,
> but I can't find a good reference for the T4 L1 cache line size.
>
>>
>> /usr/sbin/prtpicl -v |grep l1-dcache |more
>> :l1-dcache-line-size 32
>> :l1-dcache-size 16384
>> :l1-dcache-associativity 4
>
> I'm guessing the above is from a 'T4' or newer machine.
>
> And here are examples from these machines:
>
> $ uname -a
> SunOS dr-evil 5.10 Generic_142900-03 sun4v sparc SUNW,Sun-Fire-T1000
>
> $ /usr/sbin/prtpicl -v | grep l1-dcache-line-size | sort -u
> :l1-dcache-line-size 16
>
> $ uname -a
> SunOS mrspock 5.10 Generic_141444-09 sun4v sparc SUNW,T5440
>
> $ /usr/sbin/prtpicl -v | head -1000 | grep l1-dcache-line-size | sort -u
> :l1-dcache-line-size 16
>
> prtpicl seems to go on and on and on on mrspock... hence 'head -1000'
>
> $ uname -a
> SunOS terminus 5.11 11.0 sun4u sparc SUNW,SPARC-Enterprise
>
> $ /usr/sbin/prtpicl -v | grep l1-dcache-line-size | sort -u
> :l1-dcache-line-size 0x40
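>
> (Aside: if the VM ever needed to read this itself on Solaris, a hedged
> sketch using libpicl, the library behind prtpicl, could look like the
> following; the "cpu" node class and an int-sized property value are my
> assumptions here:)
>
>   #include <picl.h>  // Solaris libpicl; link with -lpicl
>
>   // Walk callback: read "l1-dcache-line-size" from the first node that
>   // carries the property.
>   static int pick_l1_line_size(picl_nodehdl_t node, void* result) {
>     int size = 0;  // assumes the property is a 4-byte int
>     if (picl_get_propval_by_name(node, "l1-dcache-line-size",
>                                  &size, sizeof(size)) == PICL_SUCCESS) {
>       *(int*) result = size;
>       return PICL_WALK_TERMINATE;  // first hit is enough
>     }
>     return PICL_WALK_CONTINUE;
>   }
>
>   static int l1_dcache_line_size() {
>     int size = 0;
>     picl_nodehdl_t root;
>     if (picl_initialize() == PICL_SUCCESS) {
>       if (picl_get_root(&root) == PICL_SUCCESS) {
>         picl_walk_tree_by_class(root, "cpu", &size, pick_l1_line_size);
>       }
>       picl_shutdown();
>     }
>     return size;  // 0 means the property was not found
>   }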
>
>
>
>
>> It is 32 for T4, and for T7 it will be larger:
>>
>>   static intx prefetch_data_size() {
>>     return is_T4() && !is_T7() ? 32 : 64; // default prefetch block size on sparc
>>   }
>
> OK. So T1 and T2 have 16-byte L1 cache line sizes. Is there a T3?
> T4 and T5 have 32-byte L1 cache line sizes. Is there a T6?
> T7 and newer have 64-byte cache line sizes.
>
> Can I repeat (from a different e-mail thread) that SPARC really
> needs the equivalent of CPUID?
>
>
>> sun4v could be defined for Fujitsu Sparc64 too:
>>
>>   static bool is_niagara(int features) {
>>     // 'sun4v_m' may be defined on both Sun/Oracle Sparc CPUs as well as
>>     // on Fujitsu Sparc64 CPUs, but only Sun/Oracle Sparcs can be 'niagaras'.
>>     return (features & sun4v_m) != 0 && (features & sparc64_family_m) == 0;
>>   }
>
> So are the three distinct SPARC 64-bit families better stated as
> (where the Niagara family has three different L1 cache line sizes):
>
>   is_ultra3()         // 64-byte L1 cache line size
>   is_niagara()
>     is_T7()           // 64-byte L1 cache line size
>     else is_T4()      // 32-byte L1 cache line size
>     else /* T[12] */  // 16-byte L1 cache line size
>   is_sparc64()        // 64-byte L1 cache line size
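>
> (If so, as a hedged sketch only, assuming those predicates all exist
> as named above, the assignment in vm_version_sparc.cpp might read:)
>
>   if (is_ultra3() || is_sparc64()) {
>     _L1_data_cache_line_size = 64;
>   } else if (is_niagara()) {
>     if (is_T7()) {
>       _L1_data_cache_line_size = 64;
>     } else if (is_T4()) {
>       _L1_data_cache_line_size = 32;
>     } else {
>       _L1_data_cache_line_size = 16;  // T1/T2 (and T3, per your prtpicl data)
>     }
>   }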
>
>
>> vm_version_x86.hpp, vm_version_x86.cpp
>>
>> I would like to keep the cpuid bit access in the .hpp file.
>> I would suggest keeping the prefetch_data_size() code but maybe
>> renaming it to L1_line_size(), so that in the .hpp you have:
>>
>>   static intx L1_line_size() {
>>     intx result = 0;
>>     if (is_intel()) {
>>       result = (_cpuid_info.dcp_cpuid4_ebx.bits.L1_line_size + 1);
>>     } else if (is_amd()) {
>>       result = _cpuid_info.ext_cpuid5_ecx.bits.L1_line_size;
>>     }
>>     if (result < 32) // not defined ?
>>       result = 32;   // 32 bytes by default on x86 and other x64
>>     return result;
>>   }
>>
>>   static intx prefetch_data_size() {
>>     return L1_line_size();
>>   }
>>
>> and in the .cpp for > i486 (the i486 code is still yours):
>>
>>   _L1_data_cache_line_size = L1_line_size();
>
> Sure, I can move the CPUID bit stuff back into the .hpp file.
> I'll do the rename and make prefetch_data_size() a wrapper
> call to L1_line_size().
>
>
>> objectMonitor.cpp and synchronizer.cpp:
>>
>> the cast is to 'int' but the destination is 'unsigned' (you could also use 'uint'):
>>
>> unsigned int offset_stwRandom = (int)
>
> I'll check that out.
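>
> (The quoted line above is truncated, so purely as a hypothetical
> sketch, with some_offset_value as a stand-in right-hand side, I assume
> the fix is just matching the cast to the destination type:)
>
>   uint offset_stwRandom = (uint) some_offset_value; // cast matches destination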
>
>
>> combine two 'if (verbose)' into one.
>
> I'll check that out also.
>
> Dan
>
>
>>
>> On 7/9/14 9:42 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have the fix for the following bug ready for JDK9 RT_Baseline:
>>>
>>> JDK-8049717 expose L1_data_cache_line_size for diagnostic/sanity
>>> checks
>>> https://bugs.openjdk.java.net/browse/JDK-8049717
>>>
>>> Here is the URL for the webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8049717-webrev/0-jdk9-hs-rt/
>>>
>>> This fix is a standalone piece from my Contended Locking reorder
>>> and cache-line bucket. I've split it off as an independent bug fix
>>> in order to make the reorder and cache-line bucket clearer.
>>>
>>> Testing:
>>>
>>> - JPRT test jobs
>>> - manual testing of the new output via existing options:
>>> -XX:+UnlockExperimentalVMOptions -XX:SyncKnobs=Verbose=1
>>> -XX:+ExecuteInternalVMTests -XX:+VerboseInternalVMTests
>>> - Aurora Adhoc nsk.sajdi and vm.parallel_class_loading as part of
>>> testing for my Contended Locking reorder and cache-line bucket
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>