RFR(L): 8198691: CodeHeap State Analytics

Vladimir Kozlov vladimir.kozlov at oracle.com
Tue Mar 27 14:42:59 UTC 2018


On 3/27/18 2:30 AM, Schmidt, Lutz wrote:
> Hi Vladimir,
> thank you for testing and pushing!
> 
> I will spend some thoughts on how to automatically test functionality, open an RFE (https://bugs.openjdk.java.net/browse/JDK-8200296), and create some tests. But, please, give me some days grace period before you expect an RFR.

Okay, no problem.

Vladimir

> 
> Thank you,
> Lutz
> 
> On 26.03.18, 21:58, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> 
>      All tests passed! I pushed changes.
>      
>      But we forgot to add tests to check functionality. Please, file new RFE
>      and add few jtreg tests. There are exmaples in compiler/codecache/cli/
>      and other places.
>      
>      Thanks,
>      Vladimir
>      
>      On 3/26/18 9:46 AM, Vladimir Kozlov wrote:
>      > On 3/26/18 6:57 AM, Schmidt, Lutz wrote:
>      >> Thank you Vladimir!
>      >>
>      >> The missing ttyLocker sneaked out of the code unintentionally and
>      >> unnoticed. Sorry for that. I have put it back in.
>      >
>      > Okay.
>      >
>      >>
>      >>  From our experience, it is very helpful to have NMethodSweeper
>      >> information available whenever you look at CodeHeap state analytics. I
>      >> changed the code in java.cpp such that NMethodSweeper::print(out)
>      >> isn't called twice.
>      >
>      > Good.
>      >
>      >>
>      >> I have created a new webrev at
>      >> http://cr.openjdk.java.net/~lucy/webrevs/8198691.03/index.html
>      >
>      > This looks good. I will start testing this version.
>      >
>      > Thanks,
>      > Vladimir
>      >
>      >>
>      >> Thank you!
>      >> Lutz
>      >>
>      >>
>      >> On 23.03.18, 23:40, "Vladimir Kozlov" <vladimir.kozlov at oracle.com>
>      >> wrote:
>      >>
>      >>      This looks good.
>      >>      NMethodSweeper::print(out) may be called twice in java.cpp
>      >> because print_heapinfo() also calls it
>      >>      through print_info().
>      >>      You removed ttyLocker from NMethodSweeper::print() and you don't
>      >> have lock in
>      >>      CompileBroker::print_info() which is significant output block.
>      >>      Other places are fine AFAIS. Thank you for fixing coding style.
>      >>      Thanks,
>      >>      Vladimir
>      >>      On 3/23/18 9:30 AM, Schmidt, Lutz wrote:
>      >>      > Hi Vladimir, Tobias,
>      >>      >
>      >>      > I have worked on your comments quite some time. There were
>      >> changes to
>      >>      >   - share/code/codeCache.cpp
>      >>      >   - share/code/codeHeapState.cpp
>      >>      >   - share/compiler/compileBroker.cpp
>      >>      >   - share/memory/heap.hpp
>      >>      >   - share/runtime/java.cpp
>      >>      >
>      >>      > Here is a summary of what I changed/reworked/adapted:
>      >>      >   - The lock order problem is solved.
>      >>      >   - The CodeHeapStateAnalytics_lock
>      >>      >      o is acquired before the "aggregate" step is begun.
>      >>      >      o is held continuously during the aggregate and print
>      >> function.
>      >>      >      o is released at function return (after all work is done.
>      >>      >      o just protects from modification of the static variables
>      >> by other threads.
>      >>      >   - The CodeCache_lock
>      >>      >       o is acquired after the CodeHeapStateAnalytics_lock and
>      >> only if an "aggregate" step is to be performed.
>      >>      >       o hold time was never observed to be more than one
>      >> second. Not a guarantee, though.
>      >>      >   - The tty_lock is never acquired during the "aggregate" step,
>      >> so there is no interference with the CodeCache_lock.
>      >>      >   - In the print* functions, blocks that need to stay together
>      >> are first composed into a bufferedStream (size 4k). They are then
>      >> printed to the given outputStream under tty_lock.
>      >>      >   - The remaining out->print_cr() are left by intention. They
>      >> print diagnostic info if some internal inconsistency is found.
>      >>      >   - The OpenJDK code style wrt. if-then-else should now be
>      >> respected everywhere.
>      >>      >   - The commented lines you mentioned (codeHeapState.cpp/.hpp)
>      >> are gone.
>      >>      >   - The "coding alternatives" for printing to the log stream
>      >> are gone.
>      >>      >
>      >>      > SAP-internal testing against SAP JVM did not reveal any problems.
>      >>      > Testing OpenJDK (jdk/hs repo, linuxx86_64) was all green. Other
>      >> platforms did not run due to system issues.
>      >>      >
>      >>      > There is a new webrev at
>      >> http://cr.openjdk.java.net/~lucy/webrevs/8198691.02/
>      >>      >
>      >>      > Thanks for spending some time, again, on this RFR.
>      >>      >
>      >>      > Best regards,
>      >>      > Lutz
>      >>      >
>      >>      > On 20.03.18, 23:01, "Vladimir Kozlov"
>      >> <vladimir.kozlov at oracle.com> wrote:
>      >>      >
>      >>      >      As I remember we are trying to lock tty outside print
>      >> functions.
>      >>      >
>      >>      >      Yes, it could be troublesome if it is Mbs of output.
>      >> Especially when you do it for "full codecache"
>      >>      >      event when VM is still running. You also have
>      >> CodeCache_lock in print_heapinfo() and it would not be
>      >>      >      good to hold both locks at the same time. I think to have
>      >> "micro locking" (with comments) in
>      >>      >      print_heapinfo() is better then to have lock in each print
>      >> function.
>      >>      >
>      >>      >      Vladimir
>      >>      >
>      >>      >      On 3/20/18 11:57 AM, Schmidt, Lutz wrote:
>      >>      >      > Hi Vladimir,
>      >>      >      > I already saw that code but was a little hesitant to
>      >> code the same way. Why? In my case, the stringStream buffer could
>      >> become fairly large. Actual size depends on CodeHeap size and contents
>      >> as well as printing parameters. If you tell me some MB are OK, I can
>      >> change my code.
>      >>      >      > Thanks,
>      >>      >      > Lutz
>      >>      >      >
>      >>      >      > On 20.03.18, 19:42, "Vladimir Kozlov"
>      >> <vladimir.kozlov at oracle.com> wrote:
>      >>      >      >
>      >>      >      >      I think you should follow what we do with
>      >> CodeCache::print_summary():
>      >>      >      >
>      >>      >      >
>      >> http://hg.openjdk.java.net/jdk/hs/file/74db2b7cec75/src/hotspot/share/code/codeCache.cpp#l1359
>      >>
>      >>      >      >
>      >>      >      >      First, print into local buffer stringStream and
>      >> then lock tty when print that buffer.
>      >>      >      >
>      >>      >      >      Thanks,
>      >>      >      >      Vladimir
>      >>      >      >
>      >>      >      >      On 3/20/18 11:29 AM, Schmidt, Lutz wrote:
>      >>      >      >      > Hi Tobias,
>      >>      >      >      >
>      >>      >      >      > thank you for uncovering this. In
>      >> CodeCache::report_codemem_full() I oversaw that the tty lock is held
>      >> at the place I inserted the call to CompileBroker::print_heapinfo().
>      >>      >      >      >
>      >>      >      >      > That bug triggered some thoughts in my brain,
>      >> resulting in a question or two:
>      >>      >      >      >
>      >>      >      >      > Given the complex output of
>      >> CompileBroker::print_heapinfo(), what would be the OpenJDK approach to
>      >> tty locking?
>      >>      >      >      >
>      >>      >      >      > Should I do "micro locking", trying to keep
>      >> together only small blocks? That's what is implemented now.
>      >>      >      >      > Should I lock tty before each call to a print
>      >> function (like print_usedSpace, print_freeSpace, ...)?
>      >>      >      >      >
>      >>      >      >      > Either approach has its advantages, so I'm more
>      >> or less neutral. What do you all think?
>      >>      >      >      >
>      >>      >      >      > Depending on what's in favor by the community, I
>      >> will move the locks accordingly.
>      >>      >      >      >
>      >>      >      >      > Thanks,
>      >>      >      >      > Lutz
>      >>      >      >      >
>      >>      >      >      >
>      >>      >      >      > On 20.03.18, 15:45, "Tobias Hartmann"
>      >> <tobias.hartmann at oracle.com> wrote:
>      >>      >      >      >
>      >>      >      >      >      Hi Lutz,
>      >>      >      >      >
>      >>      >      >      >      I've already started testing with
>      >> -Xlog:codecache=Debug and found a problem:
>      >>      >      >      >
>      >>      >      >      >      The following tests
>      >>      >      >      >      compiler/whitebox/AllocationCodeBlobTest.java
>      >>      >      >      >      compiler/codecache/OverflowCodeCacheTest.java
>      >>      >      >      >
>      >> compiler/codecache/stress/ReturnBlobToWrongHeapTest.java
>      >>      >      >      >
>      >> compiler/codecache/stress/RandomAllocationTest.java
>      >>      >      >      >
>      >> compiler/profiling/spectrapredefineclass_classloaders/Launcher.java
>      >>      >      >      >
>      >> compiler/profiling/spectrapredefineclass/Launcher.java
>      >>      >      >      >
>      >>      >      >      >      fail with
>      >>      >      >      >      #  fatal error: acquiring lock
>      >> CodeHeapStateAnalytics_lock/6 out of order with lock tty_lock/0 --
>      >>      >      >      >      possible deadlock
>      >>      >      >      >
>      >>      >      >      >      Let me know if you need more information to
>      >> reproduce!
>      >>      >      >      >
>      >>      >      >      >      Thanks,
>      >>      >      >      >      Tobias
>      >>      >      >      >
>      >>      >      >      >      On 20.03.2018 11:25, Schmidt, Lutz wrote:
>      >>      >      >      >      > Hi Tobias,
>      >>      >      >      >      >
>      >>      >      >      >      > thank you! If you haven't started yet, you
>      >> may want to wait with testing a moment. I will remove the comments
>      >> Vladimir and you complained about and update the webrev. It's comments
>      >> only, but you never know...
>      >>      >      >      >      >
>      >>      >      >      >      > Thanks,
>      >>      >      >      >      > Lutz
>      >>      >      >      >      >
>      >>      >      >      >      > On 20.03.18, 10:46, "Tobias Hartmann"
>      >> <tobias.hartmann at oracle.com> wrote:
>      >>      >      >      >      >
>      >>      >      >      >      >     Hi Lutz,
>      >>      >      >      >      >
>      >>      >      >      >      >     very nice work! Thanks for
>      >> incorporating the requested changes. I think you can remove the commented
>      >>      >      >      >      >     LogStream code.
>      >>      >      >      >      >
>      >>      >      >      >      >     I'll re-run the tests that failed with
>      >> the last webrev.
>      >>      >      >      >      >
>      >>      >      >      >      >     Best regards,
>      >>      >      >      >      >     Tobias
>      >>      >      >      >      >
>      >>      >      >      >      >     On 19.03.2018 17:00, Schmidt, Lutz wrote:
>      >>      >      >      >      >     > Dear all,
>      >>      >      >      >      >     >
>      >>      >      >      >      >     > this is the next (second) iteration
>      >> of my CodeHeap State Analytics effort. It reflects all the comments
>      >> and suggestions I received on my initial RFR (sent out on March 1st).
>      >> Please read on to learn what was changed and what kept as is.
>      >>      >      >      >      >     >
>      >>      >      >      >      >     > May I please request reviews for
>      >>      >      >      >      >     >
>      >>      >      >      >      >     > Bug:
>      >> https://bugs.openjdk.java.net/browse/JDK-8198691
>      >>      >      >      >      >     > Webrev:
>      >> http://cr.openjdk.java.net/~lucy/webrevs/8198691.01/
>      >>      >      >      >      >     >
>      >>      >      >      >      >     > Instead of keeping the long tail of
>      >> comments and responses, I decided to provide a summary of what happened.
>      >>      >      >      >      >     >  - Most of the new code was moved to
>      >> new files: share/code/codeHeapState.cpp and share/code/codeHeapState.hpp
>      >>      >      >      >      >     >  - I have added, as requested, an
>      >> abbreviated version of the "General Description" chapter to
>      >> codeHeapState.cpp
>      >>      >      >      >      >     >  - In case of SegmentedCodeCache,
>      >> the iteration is limited to FOR_ALL_ALLOCABLE_HEAPS(). There were
>      >> issues in aot tests when using FOR_ALL_HEAPS().
>      >>      >      >      >      >     >  - All references to the RFE id
>      >> should be gone.
>      >>      >      >      >      >     >  - In share/runtime/java.cpp, the
>      >> call to CompileBroker::print_heapinfo() now is close to
>      >> "PrintCodeCache" for both, product and nonproduct cases.
>      >>      >      >      >      >     >  - The edited/updated documentation
>      >> is available as an attachment to the bug (in PDF format).
>      >>      >      >      >      >     >  - I added code to
>      >> share/code/codeCache.cpp (report_codemem_full()) to print the CodeHeap
>      >> state for the first occurrence of the "full" condition.
>      >>      >      >      >      >     >  - The code style "hickups", noted
>      >> by Tobias Hartmann, are gone.
>      >>      >      >      >      >     >  - The compile time warnings and
>      >> errors are resolved.
>      >>      >      >      >      >     >
>      >>      >      >      >      >     > -XX:+PrintCodeHeapState vs.
>      >> -Xlog:codecache=Trace
>      >>      >      >      >      >     > I clearly understand and support the
>      >> intention to get rid of the Print* command line arguments. Therefore,
>      >> the PrintCodeHeapState command line argument is gone. You can request
>      >> the CodeHeap state analytics with the -Xlog:codecache=Trace (vm
>      >> shutdown) or -Xlog:codecache=Debug (CodeCache full and vm shutdown)
>      >> switches. The output is directed to tty, not to the log stream.
>      >>      >      >      >      >     >
>      >>      >      >      >      >     > The reason for not using the log
>      >> stream is simple: UL prefixes every line with a timestamp and the
>      >> trace tags. Unfortunately, that messes up my formatting big time. The
>      >> jcmd output, on the other hand, will not have the UL prefixes. I would
>      >> have to distinguish between UL and jcmd output when formatting. In
>      >> addition, I do not see a benefit from adding the same UL prefix to
>      >> thousands of lines.
>      >>      >      >      >      >     >
>      >>      >      >      >      >     > Comments are very welcome!
>      >>      >      >      >      >     >
>      >>      >      >      >      >     > Best Regards,
>      >>      >      >      >      >     > Lutz
>      >>      >      >      >      >     >
>      >>      >      >      >      >     >
>      >>      >      >      >      >     >
>      >>      >      >      >      >     >
>      >>      >      >      >      >
>      >>      >      >      >      >
>      >>      >      >      >
>      >>      >      >      >
>      >>      >      >
>      >>      >      >
>      >>      >
>      >>      >
>      >>
>      
> 


More information about the hotspot-compiler-dev mailing list