Request for review: 8003424 - Enable Class Data Sharing for CompressedOops
Vladimir Kozlov
vladimir.kozlov at oracle.com
Wed Aug 7 09:20:28 PDT 2013
>> *+ if (PrintCompressedOopsMode || (PrintMiscellaneous && Verbose)) {*
>> *+ gclog_or_tty->print_cr("Narrow klass base: " PTR_FORMAT ", Narrow klass shift: " SIZE_FORMAT,*
>>
>> I see this conditional comes from universe.cpp but I don't see why this should be printed for the OR condition:
>> PrintMiscellaneous && Verbose.
> I removed the second OR condition.
I asked Harold to do that because we print information about coop info with (PrintMiscellaneous && Verbose):
[SafePoint Polling address: 0x00000001106d5000]
Logical CPUs per core: 2
UseSSE=4 UseAVX=1 UseAES=1
Allocation prefetching: PREFETCHNTA at distance 192, 4 lines of 64 bytes
PrefetchCopyIntervalInBytes 576
PrefetchScanIntervalInBytes 576
PrefetchFieldsAhead 1
CPU:total 8 (4 cores per cpu, 2 threads per core) family 6 model 42 stepping 7, cmov, cx8, fxsr, mmx, sse, sse2, sse3,
ssse3, sse4.1, sse4.2, popcnt, avx, aes, ht, tsc, tscinvbit
heap address: 0x00000007ada00000, size: 1318 MB, zero based Compressed Oops
I wanted to avoid specifying additional flag -XX:+PrintCompressedOopsMode :)
Thanks,
Vladimir
On 8/7/13 5:02 AM, harold seigel wrote:
> Hi Coleen,
>
> Thanks for all the comments. Please see my replies inline.
>
> Harold
>
> On 8/5/2013 3:21 PM, Coleen Phillimore wrote:
>>
>> On 08/02/2013 04:31 PM, harold seigel wrote:
>>> Hi,
>>>
>>> Please review this updated webrev for bug 8003424: http://cr.openjdk.java.net/~hseigel/bug_8003424.2/
>>> <http://cr.openjdk.java.net/%7Ehseigel/bug_8003424.2/>
>>>
>>> The updated webrev has a lot of changes from the previous webrev, including the following:
>>>
>>> 1. Class metaspace information is now output when -XX:+PrintCompressedOopsMode is specified.
>>>
>>> 2. decode_klass_not_null(Register dst, Register src) no longer clobbers R12.
>>>
>>
>> If decode_klass_not_null(src, dest) doesn't use r12, isn't the size calculated by
>> *instr_size_for_decode_klass_not_null* now wrong? Or is this size calculation only valid for the case where you only
>> have one register? Or can this size be a max size?
> This size is a max size.
>>
>> http://cr.openjdk.java.net/~hseigel/bug_8003424.2/src/share/vm/memory/heap.cpp.frames.html
>>
>> In this file can you spell out segm_r_aligned to something more descriptive - like reserved_segment_alignment,
>> reserved_segments_size, committed_segments_size.
> This change will be in the next webrev.
>>
>> http://cr.openjdk.java.net/~hseigel/bug_8003424.2/src/share/vm/memory/metaspace.cpp.udiff.html
>>
>> *+ if (PrintCompressedOopsMode || (PrintMiscellaneous && Verbose)) {*
>> *+ gclog_or_tty->print_cr("Narrow klass base: " PTR_FORMAT ", Narrow klass shift: " SIZE_FORMAT,*
>>
>> I see this conditional comes from universe.cpp but I don't see why this should be printed for the OR condition:
>> PrintMiscellaneous && Verbose.
> I removed the second OR condition.
>>
>> 2941 #ifdef _LP64
>> 2942 Universe::set_narrow_klass_base((address)_space_list->current_virtual_space()->bottom()
>>
>> Put a comment why you do this. It's so UseCompressedKlassPointers decoding "works" while you are creating the shared
>> archive. The oops are thrown away right now so it doesn't matter that the encoded value uses a different base during
>> dumping than it will use during UseSharedSpaces. When we share String oops, we'll have to fix this code but I don't
>> know how to fix this yet. Maybe add code that always requires that the CDS archive base is the lower of the two
>> bases. Not a problem with your code.
>>
>> Maybe you should assert that UseCompressedOops and KlassPtrs are set here though.
> I added both a comment and an assert.
>>
>> 3012 if (using_class_space()) {
>> 3013 _class_space_list = new VirtualSpaceList(rs);
>> 3014 }
>>
>> This function initialize_class_space() is always called when you are using_class_space() so why this conditional?
>> Shouldn't this also be moved to 2917 under where it's called (or above) in the ifdef LP64 since it's LP64 only now?
> I changed the conditional to an assert.
>>
>> 2557 if ((mdtype == Metaspace::ClassType) && !Metaspace::using_class_space()) {
>> 2558 return 0;
>> 2559 }
>> One way to shorten this expression that appears a few times, is to add an inlined overloaded
>> using_class_space(MetadataType).
> I'd prefer to leave this code as is. This expression occurs only twice and I think the existing code is more
> descriptive than a new function would be.
>>
>> This looks really good. I've been through it a few times and I don't see any problems, other than these suggestions here.
>>
>> Thanks!
>> Coleen
>>
>>>
>>> 3. Method using_class_vsm() was renamed to using_class_space().
>>>
>>> 4. Type narrowKlass was added and replaces narrowOop, where appropriate.
>>>
>>> 5. The Metaspace:: qualifier was removed, where it was unnecessary.
>>>
>>> 6. Metaspace::initialize_class_space() was made private.
>>>
>>> 7. Method max_heap_for_compressed_oops(), in arguments.cpp, was updated.
>>>
>>> Performance tests were run by Jenny using specjvm2008, specjbb2005, and specjbb2013. The results showed no
>>> significant performance difference in terms of scores and gc behavior.
>>>
>>> Additional testing was done on Solaris Sparc and Linux x86 using SpecJBB2005 with -Xmx34g and
>>> -XX:ObjectAlignmentInBytes=16 & 32.
>>>
>>> Please let me know what additional tests should be run.
>>>
>>> Thanks!
>>> Harold
>>>
>>> On 7/24/2013 4:36 PM, Vladimir Kozlov wrote:
>>>> Hi Harold,
>>>>
>>>> > Usually the narrow_klass_base will be 32G and the
>>>> > LogKlassAlignmentInBytes will be 3 (8 bytes alignment) which means that
>>>> > narrow_klass_base >> 3 will exceed maxint. So, we will need R12, unless
>>>> > we do multiple addq(r, narrow_klass_base_shifted()). Does that make sense?
>>>>
>>>> My bad, I miscalculated. But we have "perfect value" 0x800000000 and I am tempted to use may be bts (bit set) to
>>>> avoid R12 usage (assuming or with check that class metaspace size < 32Gb).
>>>>
>>>> > Is there one prefix byte per instruction, or just for certain instructions?
>>>>
>>>> "Not all instructions require a REX prefix in 64-bit mode. A prefix is necessary only if an instruction references
>>>> one of the extended registers or uses a 64-bit operand."
>>>>
>>>> I want you also run tests with -XX:ObjectAlignmentInBytes=16 and =32 and big java heap. Recently we got several bugs
>>>> which indicated that we did not separate cleanly oops and klass pointers en/decoding.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 7/24/13 11:35 AM, harold seigel wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>> Thank you for the review! Please see inline comments.
>>>>>
>>>>> Thanks, Harold
>>>>>
>>>>> On 7/19/2013 7:56 PM, Vladimir Kozlov wrote:
>>>>>> On 7/18/13 3:03 PM, Vladimir Kozlov wrote:
>>>>>>> Nice clean up, Harold
>>>>>>>
>>>>>>> Could you add the size of class metaspace in output with
>>>>>>> PrintCompressedOopsMode (and with verbose). I don't want to remember an
>>>>>>> other very long flag name: TraceMetavirtualspaceAllocation.
>>>>> Will be done in next webrev.
>>>>>>>
>>>>>>> Also it would be nice to print these info line (and compressed oops info
>>>>>>> line) in hs_err files.
>>>>> I filed an enhancement bug for this: 8021264
>>>>> <https://jbs.oracle.com/bugs/browse/JDK-8021264>.
>>>>>>>
>>>>>>> About "effect(KILL cr); /// ???? is this KILL needed?" in x86_64.ad.
>>>>>>> encode_klass_not_null() and decode_klass_not_null() use arithmetic
>>>>>>> instructions which modify flags: subq, addq, shrq, xorptr. Also note
>>>>>>> that code in .ad file does not check the encoding mode for narrow
>>>>>>> klass/oop so it should be conservative and assume that arithmetic
>>>>>>> instructions are used.
>>>>> OK. Thanks.
>>>>>>>
>>>>>>> If Universe::narrow_klass_shift() == 0 you have klass base below 4Gb so
>>>>>>> the value is 32 bit. You can use it directly in narrow klass
>>>>>>> encoding/decoding without destroying R12.
>>>>>>
>>>>>> Correction. You can do it only if base < 2Gb max positive int (sign
>>>>>> bit is extended so you will get wrong result with address > 2gb).
>>>>> But the base will normally be 32Gb. So this case won't happen very often.
>>>>>>
>>>>>> You definitely should not use R12 in decode_klass_not_null(Register
>>>>>> dst, Register src) method where you have free 'dst' register.
>>>>> Will be fixed in next webrev.
>>>>>>
>>>>>> About CompressedKlassPointersBase 0x800000000 (32*G). Actually it
>>>>>> should depend on ObjectAlignmentInBytes. For =16 it should be 64Gb.
>>>>>> The idea was to avoid using R12 by using shifted base:
>>>>>>
>>>>>> decode:
>>>>>> if (Universe::narrow_klass_base_shifted() != 0) {
>>>>>> if (Universe::set_narrow_klass_shift() == 0) {
>>>>>> shrq(r, LogKlassAlignmentInBytes);
>>>>>> }
>>>>>> addq(r, Universe::narrow_klass_base_shifted());
>>>>>> shlq(r, LogKlassAlignmentInBytes);
>>>>>> }
>>>>>>
>>>>>> Universe::narrow_klass_base_shifted() is set only if
>>>>>> (Universe::narrow_klass_base >> LogKlassAlignmentInBytes) <= maxint
>>>>>> (max positive int).
>>>>> Usually the narrow_klass_base will be 32G and the
>>>>> LogKlassAlignmentInBytes will be 3 (8 bytes alignment) which means that
>>>>> narrow_klass_base >> 3 will exceed maxint. So, we will need R12, unless
>>>>> we do multiple addq(r, narrow_klass_base_shifted()). Does that make sense?
>>>>>>
>>>>>> And you have general memory reservation problem. At least on Solaris,
>>>>>> as I remember, OS will always try to keep protected 4Kb pages between
>>>>>> different requested memory regions. That is why in jdk7 we first
>>>>>> reserve one huge space and then cut it on java heap, perm gen and
>>>>>> protection page below heap to catch implicit NULL exceptions with
>>>>>> compressed oops.
>>>>>>
>>>>>> So, please, verify that you are getting 'cds_address + cds_total'
>>>>>> address or CompressedKlassPointersBase without CDS and with compressed
>>>>>> oops and with Xmx20Gb.
>>>>> I verifed that we get either cds_address+cds_total as the address, or
>>>>> CompressedKlassPointersBase as the address without CDS on Linux, Windows
>>>>> 7, Mac OS, and Solaris 5.11. This is true with default heap sizes and
>>>>> -Xmx32G.
>>>>>
>>>>> On Solaris 5.10, I don't get CompressedKlassPointersBase or the desired
>>>>> CDS address for class metaspace regardless of heap size.
>>>>>>
>>>>>>>
>>>>>>> instr_size_for_decode_klass_not_null() is ugly but unfortunately we
>>>>>>> can't do other way. I assume you use max size because depending on
>>>>>>> registers used in instructions you will or will not get prefix byte
>>>>>>> (r8-r15).
>>>>> Is there one prefix byte per instruction, or just for certain instructions?
>>>>>>>
>>>>>>> I will look on changes in more details may be late.
>>>>>>
>>>>>> What 'vsm' stands for in using_class_vsm()? Don't use abbreviations
>>>>>> which are not obvious.
>>>>> I changed using_class_vsm() to using_class_space().
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 7/18/13 10:54 AM, harold seigel wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please review this fix for bug 8003424.
>>>>>>>>
>>>>>>>> Open webrev at http://cr.openjdk.java.net/~hseigel/bug_8003424/
>>>>>>>> <http://cr.openjdk.java.net/%7Ehseigel/bug_8003424/>
>>>>>>>> Open bug links at:
>>>>>>>>
>>>>>>>> http://bugs.sun.com/view_bug.do?bug_id=8003424
>>>>>>>> http://bugs.sun.com/view_bug.do?bug_id=8016729
>>>>>>>> http://bugs.sun.com/view_bug.do?bug_id=8011610
>>>>>>>>
>>>>>>>> JBS Bug links at
>>>>>>>>
>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-8003424
>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-8016729
>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-8011610
>>>>>>>>
>>>>>>>>
>>>>>>>> This fix provides support for using compressed klass pointers with CDS.
>>>>>>>> It also changes the class metaspace allocation on 64-bit platforms when
>>>>>>>> UseCompressedKlassPointers is true. Instead of allocating the class
>>>>>>>> metaspace as part of the Java Heap allocation and locating it at the
>>>>>>>> base of that allocation, the metaspace will now be allocated
>>>>>>>> separately,
>>>>>>>> above the Java heap. This will enable future expansion of the
>>>>>>>> metaspace
>>>>>>>> because it won't be backed up against the Java heap. If CDS is being
>>>>>>>> used, then the CDS region will be allocated between the Java heap and
>>>>>>>> the metaspace.
>>>>>>>>
>>>>>>>> The new class metaspace allocation code tries to allocate memory at
>>>>>>>> 32G,
>>>>>>>> or above the CDS region, if it is present. Because of this, encoding
>>>>>>>> and decoding of compressed klass pointers will always require use of a
>>>>>>>> base register. So, encoding and decoding of compressed klass pointers
>>>>>>>> will differ from compressed oops.
>>>>>>>>
>>>>>>>> There are no class metaspace allocation changes if
>>>>>>>> UseCompressedKlassPointers is turned off or if running on a 32-bit
>>>>>>>> platform.
>>>>>>>>
>>>>>>>> The code changes also include some cleanup and will fix bugs 8016729,
>>>>>>>> 8011610, and 8003424.
>>>>>>>>
>>>>>>>>
>>>>>>>> Many of the modules in this webrev contain a lot of changes. Modules
>>>>>>>> macroAssembler_sparc.cpp and macroAssembler_x86.cpp were changed to
>>>>>>>> support the new encoding and decoding of compressed klass pointers.
>>>>>>>>
>>>>>>>> Module metaspace.cpp was changed significantly to support the new
>>>>>>>> allocation for metaspace and the CDS region and to determine compressed
>>>>>>>> klass pointer encoding base and shifting values.
>>>>>>>>
>>>>>>>> Most of the changes to module universe.cpp were to remove code related
>>>>>>>> to allocating metaspace and remove code that considered compressed
>>>>>>>> klass
>>>>>>>> pointers when determining the compressed oops compression mechanism.
>>>>>>>>
>>>>>>>> Modules klass.inline.hpp and oop.inline.hpp were changed as part of a
>>>>>>>> cleanup requested by Coleen.
>>>>>>>>
>>>>>>>>
>>>>>>>> These changes were tested with JCK Lang and VM tests, JTReg tests,
>>>>>>>> JPRT,
>>>>>>>> GCBasher, refworkload, ute vm.quick.testlist and vm.mlvm.testlist
>>>>>>>> tests. Most of the test were run with -Xshare:on and -Xshare:off (with
>>>>>>>> and without CDS), and were run on Solaris Sparc, 32-Bit Linux and
>>>>>>>> 64-Bit
>>>>>>>> Linux. Jtreg tests were run on Windows 7 and Mac OS. JCK tests were
>>>>>>>> run on Solaris x86.
>>>>>>>>
>>>>>>>> The performance impact of these changes is TBD.
>>>>>>>>
>>>>>>>> Thanks, Harold
>>>>>>>>
>>>>>>>>
>>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list