RFR(L): 8198691: CodeHeap State Analytics
Schmidt, Lutz
lutz.schmidt at sap.com
Tue Mar 27 09:30:35 UTC 2018
Hi Vladimir,
thank you for testing and pushing!
I will spend some thoughts on how to automatically test functionality, open an RFE (https://bugs.openjdk.java.net/browse/JDK-8200296), and create some tests. But, please, give me some days grace period before you expect an RFR.
Thank you,
Lutz
On 26.03.18, 21:58, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
All tests passed! I pushed changes.
But we forgot to add tests to check functionality. Please, file new RFE
and add few jtreg tests. There are exmaples in compiler/codecache/cli/
and other places.
Thanks,
Vladimir
On 3/26/18 9:46 AM, Vladimir Kozlov wrote:
> On 3/26/18 6:57 AM, Schmidt, Lutz wrote:
>> Thank you Vladimir!
>>
>> The missing ttyLocker sneaked out of the code unintentionally and
>> unnoticed. Sorry for that. I have put it back in.
>
> Okay.
>
>>
>> From our experience, it is very helpful to have NMethodSweeper
>> information available whenever you look at CodeHeap state analytics. I
>> changed the code in java.cpp such that NMethodSweeper::print(out)
>> isn't called twice.
>
> Good.
>
>>
>> I have created a new webrev at
>> http://cr.openjdk.java.net/~lucy/webrevs/8198691.03/index.html
>
> This looks good. I will start testing this version.
>
> Thanks,
> Vladimir
>
>>
>> Thank you!
>> Lutz
>>
>>
>> On 23.03.18, 23:40, "Vladimir Kozlov" <vladimir.kozlov at oracle.com>
>> wrote:
>>
>> This looks good.
>> NMethodSweeper::print(out) may be called twice in java.cpp
>> because print_heapinfo() also calls it
>> through print_info().
>> You removed ttyLocker from NMethodSweeper::print() and you don't
>> have lock in
>> CompileBroker::print_info() which is significant output block.
>> Other places are fine AFAIS. Thank you for fixing coding style.
>> Thanks,
>> Vladimir
>> On 3/23/18 9:30 AM, Schmidt, Lutz wrote:
>> > Hi Vladimir, Tobias,
>> >
>> > I have worked on your comments quite some time. There were
>> changes to
>> > - share/code/codeCache.cpp
>> > - share/code/codeHeapState.cpp
>> > - share/compiler/compileBroker.cpp
>> > - share/memory/heap.hpp
>> > - share/runtime/java.cpp
>> >
>> > Here is a summary of what I changed/reworked/adapted:
>> > - The lock order problem is solved.
>> > - The CodeHeapStateAnalytics_lock
>> > o is acquired before the "aggregate" step is begun.
>> > o is held continuously during the aggregate and print
>> function.
>> > o is released at function return (after all work is done.
>> > o just protects from modification of the static variables
>> by other threads.
>> > - The CodeCache_lock
>> > o is acquired after the CodeHeapStateAnalytics_lock and
>> only if an "aggregate" step is to be performed.
>> > o hold time was never observed to be more than one
>> second. Not a guarantee, though.
>> > - The tty_lock is never acquired during the "aggregate" step,
>> so there is no interference with the CodeCache_lock.
>> > - In the print* functions, blocks that need to stay together
>> are first composed into a bufferedStream (size 4k). They are then
>> printed to the given outputStream under tty_lock.
>> > - The remaining out->print_cr() are left by intention. They
>> print diagnostic info if some internal inconsistency is found.
>> > - The OpenJDK code style wrt. if-then-else should now be
>> respected everywhere.
>> > - The commented lines you mentioned (codeHeapState.cpp/.hpp)
>> are gone.
>> > - The "coding alternatives" for printing to the log stream
>> are gone.
>> >
>> > SAP-internal testing against SAP JVM did not reveal any problems.
>> > Testing OpenJDK (jdk/hs repo, linuxx86_64) was all green. Other
>> platforms did not run due to system issues.
>> >
>> > There is a new webrev at
>> http://cr.openjdk.java.net/~lucy/webrevs/8198691.02/
>> >
>> > Thanks for spending some time, again, on this RFR.
>> >
>> > Best regards,
>> > Lutz
>> >
>> > On 20.03.18, 23:01, "Vladimir Kozlov"
>> <vladimir.kozlov at oracle.com> wrote:
>> >
>> > As I remember we are trying to lock tty outside print
>> functions.
>> >
>> > Yes, it could be troublesome if it is Mbs of output.
>> Especially when you do it for "full codecache"
>> > event when VM is still running. You also have
>> CodeCache_lock in print_heapinfo() and it would not be
>> > good to hold both locks at the same time. I think to have
>> "micro locking" (with comments) in
>> > print_heapinfo() is better then to have lock in each print
>> function.
>> >
>> > Vladimir
>> >
>> > On 3/20/18 11:57 AM, Schmidt, Lutz wrote:
>> > > Hi Vladimir,
>> > > I already saw that code but was a little hesitant to
>> code the same way. Why? In my case, the stringStream buffer could
>> become fairly large. Actual size depends on CodeHeap size and contents
>> as well as printing parameters. If you tell me some MB are OK, I can
>> change my code.
>> > > Thanks,
>> > > Lutz
>> > >
>> > > On 20.03.18, 19:42, "Vladimir Kozlov"
>> <vladimir.kozlov at oracle.com> wrote:
>> > >
>> > > I think you should follow what we do with
>> CodeCache::print_summary():
>> > >
>> > >
>> http://hg.openjdk.java.net/jdk/hs/file/74db2b7cec75/src/hotspot/share/code/codeCache.cpp#l1359
>>
>> > >
>> > > First, print into local buffer stringStream and
>> then lock tty when print that buffer.
>> > >
>> > > Thanks,
>> > > Vladimir
>> > >
>> > > On 3/20/18 11:29 AM, Schmidt, Lutz wrote:
>> > > > Hi Tobias,
>> > > >
>> > > > thank you for uncovering this. In
>> CodeCache::report_codemem_full() I oversaw that the tty lock is held
>> at the place I inserted the call to CompileBroker::print_heapinfo().
>> > > >
>> > > > That bug triggered some thoughts in my brain,
>> resulting in a question or two:
>> > > >
>> > > > Given the complex output of
>> CompileBroker::print_heapinfo(), what would be the OpenJDK approach to
>> tty locking?
>> > > >
>> > > > Should I do "micro locking", trying to keep
>> together only small blocks? That's what is implemented now.
>> > > > Should I lock tty before each call to a print
>> function (like print_usedSpace, print_freeSpace, ...)?
>> > > >
>> > > > Either approach has its advantages, so I'm more
>> or less neutral. What do you all think?
>> > > >
>> > > > Depending on what's in favor by the community, I
>> will move the locks accordingly.
>> > > >
>> > > > Thanks,
>> > > > Lutz
>> > > >
>> > > >
>> > > > On 20.03.18, 15:45, "Tobias Hartmann"
>> <tobias.hartmann at oracle.com> wrote:
>> > > >
>> > > > Hi Lutz,
>> > > >
>> > > > I've already started testing with
>> -Xlog:codecache=Debug and found a problem:
>> > > >
>> > > > The following tests
>> > > > compiler/whitebox/AllocationCodeBlobTest.java
>> > > > compiler/codecache/OverflowCodeCacheTest.java
>> > > >
>> compiler/codecache/stress/ReturnBlobToWrongHeapTest.java
>> > > >
>> compiler/codecache/stress/RandomAllocationTest.java
>> > > >
>> compiler/profiling/spectrapredefineclass_classloaders/Launcher.java
>> > > >
>> compiler/profiling/spectrapredefineclass/Launcher.java
>> > > >
>> > > > fail with
>> > > > # fatal error: acquiring lock
>> CodeHeapStateAnalytics_lock/6 out of order with lock tty_lock/0 --
>> > > > possible deadlock
>> > > >
>> > > > Let me know if you need more information to
>> reproduce!
>> > > >
>> > > > Thanks,
>> > > > Tobias
>> > > >
>> > > > On 20.03.2018 11:25, Schmidt, Lutz wrote:
>> > > > > Hi Tobias,
>> > > > >
>> > > > > thank you! If you haven't started yet, you
>> may want to wait with testing a moment. I will remove the comments
>> Vladimir and you complained about and update the webrev. It's comments
>> only, but you never know...
>> > > > >
>> > > > > Thanks,
>> > > > > Lutz
>> > > > >
>> > > > > On 20.03.18, 10:46, "Tobias Hartmann"
>> <tobias.hartmann at oracle.com> wrote:
>> > > > >
>> > > > > Hi Lutz,
>> > > > >
>> > > > > very nice work! Thanks for
>> incorporating the requested changes. I think you can remove the commented
>> > > > > LogStream code.
>> > > > >
>> > > > > I'll re-run the tests that failed with
>> the last webrev.
>> > > > >
>> > > > > Best regards,
>> > > > > Tobias
>> > > > >
>> > > > > On 19.03.2018 17:00, Schmidt, Lutz wrote:
>> > > > > > Dear all,
>> > > > > >
>> > > > > > this is the next (second) iteration
>> of my CodeHeap State Analytics effort. It reflects all the comments
>> and suggestions I received on my initial RFR (sent out on March 1st).
>> Please read on to learn what was changed and what kept as is.
>> > > > > >
>> > > > > > May I please request reviews for
>> > > > > >
>> > > > > > Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8198691
>> > > > > > Webrev:
>> http://cr.openjdk.java.net/~lucy/webrevs/8198691.01/
>> > > > > >
>> > > > > > Instead of keeping the long tail of
>> comments and responses, I decided to provide a summary of what happened.
>> > > > > > - Most of the new code was moved to
>> new files: share/code/codeHeapState.cpp and share/code/codeHeapState.hpp
>> > > > > > - I have added, as requested, an
>> abbreviated version of the "General Description" chapter to
>> codeHeapState.cpp
>> > > > > > - In case of SegmentedCodeCache,
>> the iteration is limited to FOR_ALL_ALLOCABLE_HEAPS(). There were
>> issues in aot tests when using FOR_ALL_HEAPS().
>> > > > > > - All references to the RFE id
>> should be gone.
>> > > > > > - In share/runtime/java.cpp, the
>> call to CompileBroker::print_heapinfo() now is close to
>> "PrintCodeCache" for both, product and nonproduct cases.
>> > > > > > - The edited/updated documentation
>> is available as an attachment to the bug (in PDF format).
>> > > > > > - I added code to
>> share/code/codeCache.cpp (report_codemem_full()) to print the CodeHeap
>> state for the first occurrence of the "full" condition.
>> > > > > > - The code style "hickups", noted
>> by Tobias Hartmann, are gone.
>> > > > > > - The compile time warnings and
>> errors are resolved.
>> > > > > >
>> > > > > > -XX:+PrintCodeHeapState vs.
>> -Xlog:codecache=Trace
>> > > > > > I clearly understand and support the
>> intention to get rid of the Print* command line arguments. Therefore,
>> the PrintCodeHeapState command line argument is gone. You can request
>> the CodeHeap state analytics with the -Xlog:codecache=Trace (vm
>> shutdown) or -Xlog:codecache=Debug (CodeCache full and vm shutdown)
>> switches. The output is directed to tty, not to the log stream.
>> > > > > >
>> > > > > > The reason for not using the log
>> stream is simple: UL prefixes every line with a timestamp and the
>> trace tags. Unfortunately, that messes up my formatting big time. The
>> jcmd output, on the other hand, will not have the UL prefixes. I would
>> have to distinguish between UL and jcmd output when formatting. In
>> addition, I do not see a benefit from adding the same UL prefix to
>> thousands of lines.
>> > > > > >
>> > > > > > Comments are very welcome!
>> > > > > >
>> > > > > > Best Regards,
>> > > > > > Lutz
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > >
>> > > >
>> > >
>> > >
>> >
>> >
>>
More information about the hotspot-compiler-dev
mailing list