RFR(S): 8005885: enhance PrintCodeCache to print more data
Albert Noll
albert.noll at oracle.com
Tue Sep 3 04:10:12 PDT 2013
Hi all,
thanks again for the feedback.
Here is the new webrev:
http://cr.openjdk.java.net/~anoll/8005885/webrev.02/
<http://cr.openjdk.java.net/%7Eanoll/8005885/webrev.02/>
The changes are as follows:
@Vladimir:
I changed the parameters to size_t, since using a signed variable for a
value that cannot become
negative does not make sense. Also, the interface to the code cache uses
size_t. E.g., capacity(), max_capacity(), etc.
However, I did not consider the atomicity of increment, so I changed the
types back to int.
I now use the standard output of CodeBlob_sizes, so the output of a
detailed print of the code cache
looks as follows:
Code cache entries: (total of #2265)
nmethods: 1797# 76733 kB
Java 1723# 6658 kB (hdr 7%, loc 6%, code 52%,
stub 4%, [oops 0%, data 0%, pcs 9%])
Tier 1 319# 283 kB (hdr 32%, loc 5%, code 29%,
stub 20%, [oops 0%, data 0%, pcs 1%])
Tier 2 79# 304 kB (hdr 7%, loc 8%, code 52%,
stub 6%, [oops 0%, data 0%, pcs 8%])
Tier 3 993# 4697 kB (hdr 6%, loc 7%, code 58%,
stub 5%, [oops 0%, data 0%, pcs 8%])
Tier 4 332# 1371 kB (hdr 6%, loc 3%, code 38%,
stub 1%, [oops 0%, data 1%, pcs 16%])
Native 74# 75 kB (hdr 28%, loc 10%, code 60%,
stub 0%, [oops 0%, data 0%, pcs 0%])
runtime stubs 59# 44 kB (hdr 9%, loc 6%, code 83%,
stub 0%, [oops 0%, data 0%, pcs 0%])
adapters 391# 302 kB (hdr 8%, loc 8%, code 81%,
stub 0%, [oops 0%, data 0%, pcs 0%])
buffer blobs 12# 1469 kB (hdr 0%, loc 0%, code 99%,
stub 0%, [oops 0%, data 0%, pcs 0%])
deoptimization stubs 1# 1 kB (hdr 5%, loc 2%, code 89%,
stub 0%, [oops 0%, data 0%, pcs 0%])
uncommon trap stubs 1# 0 kB (hdr 12%, loc 1%, code 83%,
stub 0%, [oops 0%, data 0%, pcs 0%])
others 4# 2 kB (hdr 12%, loc 1%, code 84%,
stub 0%, [oops 0%, data 0%, pcs 0%])
nmethod state distribution
in-use 1705# 6238 kB (hdr 7%, loc 6%, code 52%,
stub 4%, [oops 0%, data 0%, pcs 9%])
not-entrant 29# 191 kB (hdr 4%, loc 5%, code 56%,
stub 2%, [oops 0%, data 0%, pcs 11%])
zombie 63# 303 kB (hdr 5%, loc 6%, code 53%,
stub 4%, [oops 0%, data 0%, pcs 10%])
unloaded 0#
Do you think that is OK, or is that too much information?
@Christian, @Vladimir
I removed PrintCodeCache2 (+Verbose option) and instead introduced
-XX:PrintCodeCacheDetails=option where option can be:
"off: print no additional code cache information"
"on: print detailed code cache information when exiting"
"content: print nmethod size distribution and names when
exiting"
"oop_map: print oop map usage when exiting"
"dep_check_time: print nmethod dependency checking
information when exiting"
"trace: print code cache actions during execution"
"all: print all of the above")
PrintCodeCacheDetails is a 'develop' flag.
@Chris P.
If you think we should add more buffer blob information, I can add that.
We could use
-XX:PrintCodeCacheDetails=buffer_blobs, or something similar.
Best,
Albert
On 27.08.2013 20:37, Vladimir Kozlov wrote:
> Albert,
>
> I like your new output but when is next info is produced?:
>
> Java methods = 67K (hdr 0%, loc 3%, code 41%, stub 0%, [oops 0%, data
> 0%, pcs 40%])
> Why you changed type to size_t? Please, don't do that. You will not be
> able to use %d format in 64-bit VM and we may loose atomicity of
> increment instructions on some platforms. Counters and sizes should
> fit int since current limit for CodeCache is 2Gb.
>
> Leave current flags (except PrintCodeCache2 which even I did know
> until recently) as they are because there are tests on them already
> and people know about them.
>
> I agree that we can use new PrintCodeCacheDetails flag for details
> selection. Use Christian suggestion (see NativeMemoryTracking but list
> all values in globals.hpp flag's description).
>
> Thanks,
> Vladimir
>
> On 8/26/13 11:45 PM, Albert Noll wrote:
>> Hi,
>>
>> Chris P., Vladimir, and Christian T., thanks for looking at the code.
>>
>> One note: We currently have a lot of -XX flags to provide information
>> about the code
>> cache (PrintCodeCache, PrintCodeCacheDetails, +Verbose,
>> PrintCodecacheOnCompilation,
>> and PrintCodeCacheAllocation.
>>
>> We could combine these flags (or at least some of them) into one Integer
>> flag that gives the
>> level of detail that is printed. For example, we could make
>> -XX:PrintCodeCache=int
>> where: (this suggestion is not contained in the current patch)
>>
>> int==0 -> no debug output
>> int==1 -> same debug output is PrintCodeCache
>> int==2 -> same debug output as PrintCodeCodeCacheDetails
>> int==3 -> same debug output as with int==1 and int==2 + Verbose
>> int==4 -> same debug output as with int==3 + PrintCodeCacheOnCompilation
>> int==5-> same debug output as with int==4 + PrintCodeCacheAllocation
>>
>> Having levels of detail that can be specified via a variable seems
>> intuitive to
>> me. Please let me know what you think.
>>
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~anoll/8005885/webrev.01/
>> <http://cr.openjdk.java.net/%7Eanoll/8005885/webrev.01/>
>>
>>
>> I modified the patch so that the output is less verbose. Here are
>> samples of
>> the output:
>>
>>
>> java -XX:+PrintCodeCache ...
>> CodeCache: size=245760kB used=8862kB max_used=9120kB free=236897kB
>> Interpreter=799kB live_nmethods=1565(6894kB) dead_nmethods=34(196kB)
>> stubs=77(1518kB) adapters=212(159kB)
>> bounds [0x00007f1195000000, 0x00007f1195900000, 0x00007f11a4000000]
>> compilation: enabled
>>
>> java -XX:+PrintCodeCache -XX:+PrintCodeCacheDetails
>> CodeCache: size=245760kB used=9110kB max_used=9126kB free=236649kB
>> Interpreter=799kB live_nmethods=1607(7267kB) dead_nmethods=11(71kB)
>> stubs=77(1515kB) adapters=212(159kB)
>> bounds [0x00007f81cbb7c000, 0x00007f81cc47c000, 0x00007f81dab7c000]
>> compilation: enabled
>>
>> Code cache entries: (total of #1907)
>> nmethods: #1618 7338kB
>> Java: #1576 7295kB
>> Tier 1: #236 213kB
>> Tier 2: #792 3280kB
>> Tier 3: #358 2349kB
>> Tier 4: #190 1451kB
>> Native: #42 43kB
>> runtime-stubs: #59 44kB
>> adapters: #212 159kB
>> buffer blobs: #12 1467kB
>> deopt-stubs: #1 1kB
>> uncommon-traps: #1 0kB
>> others: #4 2kB
>>
>> nmethod state distribution
>> in-use: #1562 6596kB
>> not-entrant: #45 670kB
>> zombie: #11 71kB
>> unloaded: #0 0kB
>>
>> java -XX:+PrintCodeCache -XX:+PrintCodeCacheDetails -XX:+Verbose
>> CodeCache: size=245760kB used=8907kB max_used=8995kB free=236852kB
>> Interpreter=799kB live_nmethods=1584(6973kB) dead_nmethods=29(164kB)
>> stubs=78(1517kB) adapters=212(159kB)
>> bounds [0x00007f4c4bda8000, 0x00007f4c4c688000, 0x00007f4c5ada8000]
>> compilation: enabled
>>
>> Code cache entries: (total of #1903)
>> nmethods: #1613 7137kB
>> Java: #1572 7095kB
>> Tier 1: #245 221kB
>> Tier 2: #792 3261kB
>> Tier 3: #343 2238kB
>> Tier 4: #192 1373kB
>> Native: #41 42kB
>> runtime-stubs: #59 44kB
>> adapters: #212 159kB
>> buffer blobs: #13 1469kB
>> deopt-stubs: #1 1kB
>> uncommon-traps: #1 0kB
>> others: #4 2kB
>>
>> nmethod state distribution
>> in-use: #1537 6447kB
>> not-entrant: #47 526kB
>> zombie: #29 164kB
>> unloaded: #0 0kB
>>
>> nmethod size distribution (non-zombie java)
>> 0 - 512 bytes 556
>> 512 - 1024 bytes 318
>> 1024 - 1536 bytes 156
>> 1536 - 2048 bytes 109
>> 2048 - 2560 bytes 87
>> 2560 - 3072 bytes 56
>> 3072 - 3584 bytes 40
>> 3584 - 4096 bytes 32
>> 4096 - 4608 bytes 28
>> 4608 - 5120 bytes 31
>> 5120 - 5632 bytes 10
>> 5632 - 6144 bytes 12
>> 6144 - 6656 bytes 8
>> 6656 - 7168 bytes 11
>> 7168 - 7680 bytes 12
>> 7680 - 8192 bytes 14
>> 8192 - 8704 bytes 2
>> 8704 - 9216 bytes 7
>> 9216 - 9728 bytes 6
>> 9728 - 10240 bytes 4
>> 10240 - 10752 bytes 5
>> 10752 - 11264 bytes 7
>> 11264 - 11776 bytes 1
>> 11776 - 12288 bytes 2
>> 12288 - 12800 bytes 1
>> 12800 - 13312 bytes 3
>> 13312 - 13824 bytes 2
>> 13824 - 14336 bytes 3
>> 14336 - 14848 bytes 2
>> 14848 - 15360 bytes 6
>> 15360 - 15872 bytes 5
>> 15872 - 16384 bytes 1
>> 16896 - 17408 bytes 1
>> 17408 - 17920 bytes 1
>> 17920 - 18432 bytes 3
>> 18432 - 18944 bytes 2
>> 18944 - 19456 bytes 2
>> 19968 - 20480 bytes 2
>> 22016 - 22528 bytes 3
>> 22528 - 23040 bytes 1
>> 23552 - 24064 bytes 2
>> 25088 - 25600 bytes 1
>> 25600 - 26112 bytes 2
>> 26112 - 26624 bytes 1
>> 27136 - 27648 bytes 1
>> 27648 - 28160 bytes 1
>> 28672 - 29184 bytes 1
>> 30208 - 30720 bytes 1
>> 31744 - 32256 bytes 1
>> 35328 - 35840 bytes 1
>> 35840 - 36352 bytes 1
>> 36864 - 37376 bytes 1
>> 37376 - 37888 bytes 1
>> 41472 - 41984 bytes 1
>> 43520 - 44032 bytes 1
>> 47104 - 47616 bytes 1
>> 49664 - 50176 bytes 1
>> 58368 - 58880 bytes 1
>> nmethod dependency checking time 0.205156
>>
>> java -XX:+PrintCodeCacheOnCompilation
>> CodeCache: size=245760kB used=1610kB max_used=1610kB free=244149kB
>>
>> java -XX:PrintCodeCacheAllocation:
>> CodeCache allocation: addr: 0x00007f7ed1190850, size: 1984B
>>
>> Best,
>> Albert
>>
>> On 23.08.2013 21:24, Christian Thalinger wrote:
>>> On Aug 23, 2013, at 11:34 AM, Vladimir
>>> Kozlov<vladimir.kozlov at oracle.com> wrote:
>>>
>>>> Albert,
>>>>
>>>> I think you are missing constant section from nmethods or something
>>>> else because % are not add up to 100%. Can you also try on SPARC?
>>>>
>>>> #500 Java methods = 1119K (hdr 12%, loc 4%, code 38%, stub 2%,
>>>> [oops 0%, data 1%, pcs 24%])
>>>>
>>>> Use Kb when you show size.
>>> Actually, kB (http://en.wikipedia.org/wiki/Kilobyte).
>>>
>>>> I agree with Chris P. that output is too verbose for product *by
>>>> default* - I thinks we don't need splitting per tiers and detail
>>>> section sizes information. Note, this will be put into hs_err
>>>> files. I don't see how this additional information will help to
>>>> debug problems.
>>>>
>>>> I think by default PrintCodeCache should should sligthly verbose
>>>> current output: add sizes, split live/dead nmethods, add
>>>> interpreter and stubs instead of blobs and may be something else:
>>>>
>>>> CodeCache: size=49152Kb used=1729Kb max_used=1782Kb free=47422Kb
>>>> bounds [0x00007fbc13fb1000, 0x00007fbc14221000, 0x00007fbc16fb1000]
>>>> interpreter=199Kb stubs=302(400Kb) live_nmethods=510(1119Kb)
>>>> dead_nmethods=2(85Kb) adapters=254(116Kb)
>>>> compilation: enabled
>>>>
>>>>
>>>> It could be important when you are working on codecache improvement
>>>> or want to see how it is used. And, yes it would be nice to get
>>>> detailed info in product VM also. Unfortunately Verbose flag is not
>>>> product and in some places it guards code which we don't want in
>>>> product VM so we can't make it product.
>>>>
>>>> Maybe you can convert PrintCodeCache2 to 'diagnostic' flag and use
>>>> it for detailed information (it already does additional printing).
>>>> But add new develop flag to replace it in print_trace():
>>>>
>>>> void CodeCache::print_trace(const char* event, CodeBlob* cb, int
>>>> size) {
>>>> if (PrintCodeCache2) { // Need to add a new flag
>>> I agree with Vladimir. Can you also rename PrintCodeCache2 to
>>> PrintCodeCacheDetails?
>>>
>>> -- Chris
>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 8/23/13 6:31 AM, Albert Noll wrote:
>>>>> Hi,
>>>>>
>>>>> could I get reviews for the following patch?
>>>>>
>>>>> jbs:https://bugs.openjdk.java.net/browse/JDK-8005885
>>>>> webrev:http://cr.openjdk.java.net/~anoll/8005885/webrev.00/
>>>>> <http://cr.openjdk.java.net/%7Eanoll/8005885/webrev.00/>
>>>>>
>>>>> Many thanks in advance,
>>>>> Albert
>>>>>
>>>>> Problem: Currently we only print:
>>>>>
>>>>> CodeCache:
>>>>> nmethod dependency checking time 0.037552
>>>>> #944 live = 3681K (hdr 4%, loc 2%, code 74%, stub 1%, [oops 0%,
>>>>> data
>>>>> 0%, pcs 5%])
>>>>> #92 dead = 247K (hdr 7%, loc 6%, code 46%, stub 3%, [oops 0%,
>>>>> data 0%,
>>>>> pcs 12%])
>>>>>
>>>>> It would be helpful to have more detailed information about the
>>>>> code cache.
>>>>>
>>>>> Solution: add more detailed information about the content of the code
>>>>> cache when
>>>>> exiting the VM.
>>>>>
>>>>>
>>>>> See sample outputs:
>>>>> java -XX:-TieredCompilation -XX:+PrintCodeCache -jar
>>>>> dacapo-9.12-bach.jar fop
>>>>> ===== DaCapo 9.12 fop starting =====
>>>>> ===== DaCapo 9.12 fop PASSED in 7676 msec =====
>>>>> CodeCache: size=49152Kb used=1729Kb max_used=1782Kb free=47422Kb
>>>>> bounds [0x00007fbc13fb1000, 0x00007fbc14221000, 0x00007fbc16fb1000]
>>>>> total_blobs=814 nmethods=512 adapters=254
>>>>> compilation: enabled
>>>>>
>>>>> Interpreter: total=199k, used=124k
>>>>>
>>>>> Total number of live methods: 510
>>>>> Tier 1:
>>>>> #0 Java methods
>>>>> #0 OSR methods
>>>>> Tier 2:
>>>>> #0 Java methods
>>>>> #0 OSR methods
>>>>> Tier 3:
>>>>> #0 Java methods
>>>>> #0 OSR methods
>>>>> Tier 4:
>>>>> #500 Java methods = 1119K (hdr 12%, loc 4%, code 38%, stub 2%,
>>>>> [oops
>>>>> 0%, data 1%, pcs 24%])
>>>>> #5 OSR methods = 37K (hdr 3%, loc 3%, code 41%, stub 2%, [oops
>>>>> 0%,
>>>>> data 1%, pcs 30%])
>>>>> Native methods:
>>>>> #5 Native methods = 4K (hdr 31%, loc 10%, code 55%, stub 0%,
>>>>> [oops
>>>>> 0%, data 0%, pcs 0%])
>>>>>
>>>>> Total number of dead methods: 2
>>>>> Tier 1:
>>>>> #0 Java methods
>>>>> #0 OSR methods
>>>>> Tier 2:
>>>>> #0 Java methods
>>>>> #0 OSR methods
>>>>> Tier 3:
>>>>> #0 Java methods
>>>>> #0 OSR methods
>>>>> Tier 4:
>>>>> #1 Java methods = 67K (hdr 0%, loc 3%, code 41%, stub 0%,
>>>>> [oops 0%,
>>>>> data 0%, pcs 40%])
>>>>> #1 OSR methods = 18K (hdr 1%, loc 4%, code 39%, stub 2%, [oops
>>>>> 0%,
>>>>> data 1%, pcs 31%])
>>>>> Native methods:
>>>>> #0 Native methods
>>>>>
>>>>> Total number of stubs: 302
>>>>> #23 runtime = 5K (hdr 25%, loc 1%, code 67%, stub 0%, [oops
>>>>> 0%, data
>>>>> 0%, pcs 0%])
>>>>> #1 deoptimization = 1K (hdr 7%, loc 0%, code 89%, stub 0%,
>>>>> [oops 0%,
>>>>> data 0%, pcs 0%])
>>>>> #1 uncommon trap = 0K (hdr 13%, loc 1%, code 83%, stub 0%,
>>>>> [oops 0%,
>>>>> data 0%, pcs 0%])
>>>>> #1 exception = 0K (hdr 27%, loc 3%, code 65%, stub 0%, [oops 0%,
>>>>> data 0%, pcs 0%])
>>>>> #3 safepoint = 1K (hdr 10%, loc 1%, code 86%, stub 0%, [oops 0%,
>>>>> data 0%, pcs 0%])
>>>>> #254 C2I/I2C adapter = 116K (hdr 13%, loc 1%, code 82%, stub 0%,
>>>>> [oops 0%, data 0%, pcs 0%])
>>>>> #0 method handles adapter
>>>>> #19 other = 308K (hdr 0%, loc 0%, code 99%, stub 0%, [oops 0%,
>>>>> data
>>>>> 0%, pcs 0%])
>>>>>
>>>>>
>>>>> java -XX:+TieredCompilation -XX:+PrintCodeCache -jar
>>>>> dacapo-9.12-bach.jar fop
>>>>> ===== DaCapo 9.12 fop starting =====
>>>>> ===== DaCapo 9.12 fop PASSED in 4850 msec =====
>>>>> CodeCache: size=245760Kb used=7355Kb max_used=7370Kb free=238404Kb
>>>>> bounds [0x00007f49fd000000, 0x00007f49fd740000, 0x00007f4a0c000000]
>>>>> total_blobs=2578 nmethods=2240 adapters=254
>>>>> compilation: enabled
>>>>>
>>>>> Interpreter: total=199k, used=123k
>>>>>
>>>>> Total number of live methods: 2205
>>>>> Tier 1:
>>>>> #466 Java methods = 333K (hdr 39%, loc 5%, code 25%, stub 19%,
>>>>> [oops
>>>>> 0%, data 0%, pcs 1%])
>>>>> #0 OSR methods
>>>>> Tier 2:
>>>>> #429 Java methods = 1047K (hdr 11%, loc 5%, code 40%, stub 8%,
>>>>> [oops
>>>>> 0%, data 1%, pcs 18%])
>>>>> #0 OSR methods
>>>>> Tier 3:
>>>>> #1069 Java methods = 3817K (hdr 7%, loc 4%, code 53%, stub 5%,
>>>>> [oops
>>>>> 0%, data 0%, pcs 17%])
>>>>> #5 OSR methods = 92K (hdr 1%, loc 4%, code 57%, stub 2%, [oops
>>>>> 0%,
>>>>> data 0%, pcs 23%])
>>>>> Tier 4:
>>>>> #190 Java methods = 648K (hdr 8%, loc 3%, code 43%, stub 1%,
>>>>> [oops
>>>>> 0%, data 1%, pcs 26%])
>>>>> #3 OSR methods = 15K (hdr 5%, loc 3%, code 47%, stub 1%, [oops
>>>>> 0%,
>>>>> data 0%, pcs 28%])
>>>>> Native methods:
>>>>> #43 Native methods = 37K (hdr 31%, loc 10%, code 55%, stub 0%,
>>>>> [oops
>>>>> 0%, data 0%, pcs 0%])
>>>>>
>>>>> Total number of dead methods: 35
>>>>> Tier 1:
>>>>> #0 Java methods
>>>>> #0 OSR methods
>>>>> Tier 2:
>>>>> #3 Java methods = 22K (hdr 3%, loc 5%, code 40%, stub 6%,
>>>>> [oops 0%,
>>>>> data 1%, pcs 27%])
>>>>> #0 OSR methods
>>>>> Tier 3:
>>>>> #31 Java methods = 193K (hdr 4%, loc 4%, code 54%, stub 3%, [oops
>>>>> 0%, data 0%, pcs 21%])
>>>>> #1 OSR methods = 2K (hdr 11%, loc 5%, code 52%, stub 7%, [oops
>>>>> 0%,
>>>>> data 0%, pcs 7%])
>>>>> Tier 4:
>>>>> #0 Java methods
>>>>> #0 OSR methods
>>>>> Native methods:
>>>>> #0 Native methods
>>>>>
>>>>> Total number of stubs: 338
>>>>> #57 runtime = 24K (hdr 14%, loc 3%, code 79%, stub 0%, [oops 0%,
>>>>> data 0%, pcs 0%])
>>>>> #1 deoptimization = 1K (hdr 7%, loc 0%, code 89%, stub 0%,
>>>>> [oops 0%,
>>>>> data 0%, pcs 0%])
>>>>> #1 uncommon trap = 0K (hdr 13%, loc 1%, code 83%, stub 0%,
>>>>> [oops 0%,
>>>>> data 0%, pcs 0%])
>>>>> #1 exception = 0K (hdr 27%, loc 3%, code 65%, stub 0%, [oops 0%,
>>>>> data 0%, pcs 0%])
>>>>> #3 safepoint = 1K (hdr 10%, loc 1%, code 86%, stub 0%, [oops 0%,
>>>>> data 0%, pcs 0%])
>>>>> #254 C2I/I2C adapter = 116K (hdr 13%, loc 1%, code 82%, stub 0%,
>>>>> [oops 0%, data 0%, pcs 0%])
>>>>> #0 method handles adapter
>>>>> #21 other = 871K (hdr 0%, loc 0%, code 99%, stub 0%, [oops 0%,
>>>>> data
>>>>> 0%, pcs 0%])
>>>>>
>>>>>
>>>>> Please let me know if you think more/less/different data should be
>>>>> printed.
>>
More information about the hotspot-compiler-dev
mailing list