RFR(S): 8027593: performance drop with constrained codecache starting with hs25 b111
Srinivas Ramakrishna
ysr1729 at gmail.com
Fri Nov 8 12:59:37 PST 2013
Thanks Vladimir (and Albert) for all of the details. This is immensely
helpful to me for working around this issue in 7uxx.
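
Until the logging discussed below is available, one way to watch code cache occupancy from inside the application is to poll the standard memory pool MXBeans. What follows is a minimal sketch, not a tested recipe. The class name CodeCacheWatch is made up for illustration, and the pool-name matching is an assumption: HotSpot in 7u/8 reports a pool named "Code Cache", while other versions and other JVMs may use different names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of HotSpot or of the patch under review.
class CodeCacheWatch {

    // Assumption: code cache pools can be recognized by their name.
    static boolean looksLikeCodeCache(String poolName) {
        String n = poolName.toLowerCase(Locale.ROOT).replace(" ", "");
        return n.contains("codecache") || n.contains("codeheap");
    }

    // One summary line per matching pool; empty if the JVM exposes none.
    static List<String> snapshot() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            if (u == null || !looksLikeCodeCache(pool.getName())) {
                continue;
            }
            lines.add(String.format("%s: used=%dK committed=%dK max=%dK",
                    pool.getName(), u.getUsed() / 1024,
                    u.getCommitted() / 1024, u.getMax() / 1024));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

Calling snapshot() from a low-frequency timer (or a GC notification listener) would give the coarse-grained view of size and occupancy without per-allocation output.
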
-- ramki
On Fri, Nov 8, 2013 at 10:08 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> Hi, Ramki
>
> I doubt we will backport to 7uXX the CodeCache work we did in jdk8, unless
> our main 'customer' asks us to do that, and I have not heard complaints
> from them. We will start working on jdk9 soon, and jdk7u has moved to a
> sustaining state.
>
> For 7u the only solution is to increase the reserved CodeCache and use
> UseCodeCacheFlushing (which is ON by default since 7u40). You may also have
> to increase CodeCacheFlushingMinimumFreeSpace (1500K default). We saw
> cases where there was no space left for the compiler's temporary buffers, so
> we could not compile, but the remaining space was a little over 1500K, so
> flushing was not triggered. There are other product flushing flags which
> you can play with.
>
> Note that flushing is triggered when no space is left in the whole reserved
> space. It is not triggered when you have used up the initial space.
>
> We will work on logging in jdk9 and 8u.
>
> Regards,
> Vladimir
>
>
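To make the failure mode described above concrete, here is a toy model of the two independent thresholds. It is a sketch, not HotSpot code: the class and method names are hypothetical, the 48M reserved size is just an example, and the 2000K figure for the compiler's temporary buffers is an assumed number chosen only to reproduce the situation described (free space slightly above the 1500K flushing threshold, yet too small to compile).

```java
// Toy model only; names and the compiler buffer size are illustrative assumptions.
class CodeCacheFlushModel {
    static final long K = 1024;

    // Flushing starts only when free space in the whole reserved area
    // drops below the minimum-free threshold (1500K by default in 7u).
    static boolean flushingTriggered(long reserved, long used, long minFree) {
        return reserved - used < minFree;
    }

    // Compilation needs room for the compiler's temporary buffers.
    static boolean canCompile(long reserved, long used, long compilerBuffers) {
        return reserved - used >= compilerBuffers;
    }

    public static void main(String[] args) {
        long reserved = 48 * 1024 * K;    // example: 48M reserved
        long used = reserved - 1600 * K;  // 1600K still free
        long buffers = 2000 * K;          // assumed buffer demand
        System.out.println("default threshold: compile="
                + canCompile(reserved, used, buffers)
                + " flush=" + flushingTriggered(reserved, used, 1500 * K));
        System.out.println("raised threshold:  compile="
                + canCompile(reserved, used, buffers)
                + " flush=" + flushingTriggered(reserved, used, 2500 * K));
    }
}
```

In this model, the default threshold leaves the VM able neither to compile nor to flush. Raising the threshold above the assumed buffer demand lets flushing start, which is the reason for increasing CodeCacheFlushingMinimumFreeSpace.
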
> On 11/8/13 9:12 AM, Srinivas Ramakrishna wrote:
>
>> Thank you Albert, I appreciate it. I have one follow-up question below.
>> (Note that my comments are modulo the fact that I have not played with
>> jdk 8 at all, although I do have access to the source code for hotspot
>> trunk, so I can just look at the code. Still, this area is relatively
>> unfamiliar to me in the low-level details and now, following recent
>> rewrites and fixes, also in the high-level structure.)
>>
>>
>> On Fri, Nov 8, 2013 at 3:06 AM, Albert Noll <albert.noll at oracle.com> wrote:
>>
>> Hi Ramki,
>>
>> I have done the most recent changes to the code cache, so I might
>> help in answering your questions.
>> Please see comments inline.
>>
>>
>>
>>
>> On 11/08/2013 09:59 AM, Srinivas Ramakrishna wrote:
>>
>>> Some of this is slightly off-topic, but here goes ...
>>>
>>> I haven't looked at the code or the patch/webrev, but I would
>>> definitely vote for a "Print" flag for CodeCache
>>> state, analogous to PrintGCDetails for
>>> Java heap space state, which will periodically (say at each GC for
>>> lack of a better metronomic signal) print the
>>> size and occupancy of the code cache.
>>>
>> The flag -XX:+PrintCodeCacheOnCompilation prints the status of the code
>> cache every time memory is allocated from the code cache.
>>
>>
>> This is fine, but my concern is that this is at too fine a level of
>> granularity, at least in the initial phase or in a phase change when
>> lots of methods are typically compiled. Why not also have a somewhat
>> asynchronous style of logging that reports, at some coarser period,
>> the size and occupancy of the code cache? My guess (without looking at
>> the code) is that the occupancy and size can be determined at very low
>> cost. (Hence the suggestion of a gc as a possible place to log such
>> information.)
>>
>>
>>> I have separately made that request to this list in an earlier email
>>> this week, and would like to reiterate that request.
>>>
>>> In general, what is the advice for pre-jdk8 (i.e. 7uXX vintage) of
>>> JVM's wrt the setting of UseCodeCacheFlushing?
>>> Does having a suitably large code cache ensure that any of the
>>> myriad issues that seem to have been logged in
>>> JDK jira/bugzilla will not affect us, or is it generally recommended
>>> that we both increase the size of the code
>>> cache to a suitably safe value as well as switch off
>>> UseCodeCacheFlushing?
>>>
>> In general, we recommend using code cache flushing. The reason is
>> that if you do not use it and you run out of code cache, the VM can
>> no longer compile hot methods from bytecode to machine code. I.e., all
>> methods that have not been compiled will run in interpreted mode. If
>> your application runs out of code cache and you incur a performance
>> regression due to that, we recommend increasing the code cache size
>> (-XX:ReservedCodeCacheSize=).
>>
>>
>> Great; thanks for that advice. We have increased the reserved as well as
>> the initial code cache size. Not having looked at the code, one concern
>> was that a low initial size might cause a flush cycle, which might
>> exact a performance penalty, before the cache is expanded (as it works
>> its way to the max).
>>
>> One other concern was whether a sufficiently high occupancy of the code
>> cache will cause the flush code to kick in and cause a performance drop.
>> The strategy while we are unsure about this state of affairs has been to
>> try to avoid code cache flushing altogether by using a sufficiently
>> large code cache size.
>>
>> If the code cache occupancy stabilizes over a period of time, this might
>> seem to be a reasonable strategy until
>> we are able to move to jdk 8 where these issues are hopefully all fixed.
>>
>>
>>> Finally, does anyone on this list know where one might find the
>>> hotspot sources for various jdk7uXX releases? I am interested
>>> in looking at many of them. Given the rapid changes in code cache
>>> flushing and related code in the last few months, I'd like to
>>> understand, from looking at the code and exercising it, what
>>> kinds of performance bugs we might be open to in this area
>>> in a specific 7uXX release. (Unless someone on this list already has
>>> a crisp description of that.) I'll verify my findings
>>> with folks on this list and share them once I have looked
>>> through specific 7uXX releases.
>>>
>> You should be able to get the sources from http://openjdk.java.net/
>>
>>
>>
>> I was looking for pointers to the exact URL for the hsx 7uxx release
>> repos, but Jon has pointed me there (thanks!).
>>
>>
>>> It is only in the last week or two that I became aware of these
>>> performance issues, and would like to make sure we
>>> are protecting ourselves sufficiently against them given a specific
>>> 7uXX release.
>>>
>> With the release of Java 8, these issues should be fixed.
>>
>>
>>
>> Is there any chance that some of the more critical fixes might get
>> backported to a future 7uXX release?
>> If not, would it be worthwhile for someone from the community (and I am
>> happy to volunteer) to become involved in backporting
>> them to a specific 7uXX release? I am happy to help with that, if there's
>> any chance that it would be entertained.
>> If such a backport is already on the cards, so much the better, and we
>> can wait for that to happen.
>>
>> thanks Albert, all!
>> -- ramki
>>
>>
>>> thanks, and sorry again for hijacking the webrev discussion for
>>> this....
>>> -- ramki
>>>
>> Best,
>> Albert
>>
>>
>>>
>>> On Thu, Nov 7, 2013 at 1:48 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>>
>>> On 11/7/13 1:37 PM, Albert Noll wrote:
>>>
>>> Hi,
>>>
>>>
>>> On 11/07/2013 08:39 PM, Vladimir Kozlov wrote:
>>>
>>> On 11/7/13 11:04 AM, Igor Veresov wrote:
>>>
>>> I’d vote to put it under PrintCodeCache. And make
>>> the messages not
>>> warnings, but just “compiler disabled/enabled”. What
>>> do you think?
>>>
>>>
>>> Unfortunately there could be customers' tools which look for this
>>> message. So changing it, at least now for jdk8, is not good. With a
>>> small codecache we would expect this message to show up, but with a big
>>> codecache it should not happen. I think we should keep it as a warning
>>> but throttle it when a small codecache is used, as Chris suggested.
>>>
>>> Maybe put it under a combined check:
>>>
>>> if (PrintCodeCache || ReservedCodeCacheSize > X)
>>>
>>> Do we now have a state in which we definitely will not compile any more?
>>> Or are we always making progress? I think it will be difficult to find
>>> when it should be printed only once.
>>>
>>> With the current version (when the sweeper is enabled) we should not
>>> reach a state (unless the entire code cache is filled with OSR-methods
>>> or native methods) where we disable compilation and never re-enable it.
>>> As soon as we free memory from the code cache, we re-enable compilation.
>>> The message will be printed very frequently if the code cache is
>>> significantly smaller than the application demands.
>>>
>>> We could also solve the 'problem' by adding code that prints the warning
>>> only if compilation is disabled for a certain time. The current patch
>>> (webrev.01) defines a virtual time for the sweeper (we increment a time
>>> counter by one every time we call mark_active_nmethods), which we could
>>> use.
>>>
>>>
>>> Or only print every 10th (or whatever) message; the first one must print.
>>>
>>> Thanks,
>>> Vladimir
>>>
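The two throttling ideas above (print only every Nth message but always the first, or warn only once compilation has stayed disabled for some number of sweeper ticks) could look roughly like the following. This is a sketch in Java rather than HotSpot's C++, and every name in it (WarningThrottle, shouldPrintNth, shouldPrintAfter) is hypothetical; the tick counter merely stands in for the virtual sweeper time mentioned for webrev.01.

```java
// Hypothetical sketch; not taken from the webrev.
class WarningThrottle {
    private final int every;          // print the 1st, (every+1)th, ... occurrence
    private long occurrences;         // how often "code cache full" was reported
    private long disabledSince = -1;  // virtual time when compilation was disabled

    WarningThrottle(int every) {
        if (every < 1) {
            throw new IllegalArgumentException("every must be >= 1");
        }
        this.every = every;
    }

    // Count-based: true for the first occurrence, then once per `every`.
    synchronized boolean shouldPrintNth() {
        return (occurrences++ % every) == 0;
    }

    // Record the tick at which compilation was switched off (first call wins).
    synchronized void compilationDisabled(long now) {
        if (disabledSince < 0) {
            disabledSince = now;
        }
    }

    synchronized void compilationEnabled() {
        disabledSince = -1;
    }

    // Time-based: true only if compilation has been off for at least `ticks`.
    synchronized boolean shouldPrintAfter(long now, long ticks) {
        return disabledSince >= 0 && now - disabledSince >= ticks;
    }
}
```

The count-based variant keeps the existing message text intact for tools that look for it, while the time-based variant stays silent as long as the sweeper keeps re-enabling compilation quickly.
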
>>>
>>>
>>> Best,
>>> Albert
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>
>>> igor
>>>
>>> On Nov 7, 2013, at 3:24 AM, Albert Noll <albert.noll at oracle.com> wrote:
>>>
>>> Hi Chris,
>>>
>>> On 11/06/2013 03:18 AM, Chris Plummer wrote:
>>>
>>> BTW, one thing I forgot to mention is I now see a lot of messages
>>> for the codecache filling up. For example:
>>>
>>> Java HotSpot(TM) Client VM warning: CodeCache is full. Compiler has been disabled.
>>> Java HotSpot(TM) Client VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
>>> CodeCache: size=2700Kb used=2196Kb max_used=2196Kb free=503Kb
>>>
>>> With b111, I was only seeing one message. I suspect with b111, once
>>> this message appeared compilation was never re-enabled so the
>>> message never appeared again. In that case seeing it many times now
>>> is actually a good indicator. However, it appears even when not
>>> using -XX:+PrintCodeCache, and I can see this output being a
>>> distraction for programs whose normal operation may involve
>>> constraining the codecache and having it constantly filling up.
>>> Perhaps this message should be off by default, or possibly only
>>> appear once.
>>>
>>> You are right. The previous version just never re-enabled
>>> compilation. I also agree that the output is distracting. There are
>>> multiple ways to solve this issue. I would go for a product -XX flag
>>> which allows turning this warning on/off. Would that be ok or do you
>>> have a different solution in mind?
>>>
>>> Best,
>>> Albert
>>>
>>> cheers,
>>>
>>> Chris
>>>
>>> On 11/5/13 5:59 PM, Chris Plummer wrote:
>>>
>>> Hi Albert,
>>>
>>> I applied your patch and got some new
>>> numbers. Performance is now
>>> even better than it was with b110. See
>>> the chart I added to the bug.
>>>
>>> Nice work!
>>>
>>> Chris
>>>
>>> On 11/5/13 6:44 AM, Albert Noll wrote:
>>>
>>> Hi,
>>>
>>> could I get reviews for this small patch?
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8027593
>>> webrev: http://cr.openjdk.java.net/~anoll/8027593/webrev.00/
>>>
>>>
>>> Problem: The implementation of the sweeper (8020151) causes a
>>> performance regression for small code cache sizes. There are two
>>> issues that cause this regression:
>>> 1) NmethodSweepFraction is only adjusted according to the
>>> ReservedCodeCacheSize if TieredCompilation is enabled. As a
>>> result, NmethodSweepFraction remains 16 (if TieredCompilation is
>>> not used). This is way too large for small code cache sizes
>>> (e.g., <5m).
>>> 2) _request_mark_phase (sweeper.cpp) is initialized to false. As
>>> a result, mark_active_nmethods() did not set _invocations and
>>> _current, which results in not invoking the sweeper (calling
>>> sweep_code_cache()) at all. When TieredCompilation is enabled
>>> this was not an issue, since NMethodSweeper::notify() (which sets
>>> _request_mark_phase) is called much more frequently.
>>>
>>> Solution: 1) Move the setting of NmethodSweepFraction so that it is
>>> always executed.
>>> Solution: 2) Remove need_marking_phase(), request_nmethod_marking(),
>>> and reset_nmethod_marking(). I think that these checks are not needed
>>> since 8020151, since we do stack scanning of active nmethods
>>> irrespective of what need_marking_phase() returns. Since the patch
>>> removes need_marking_phase(), printing out the warning (line 327 in
>>> sweeper.cpp) is incorrect, i.e., we continue to invoke the sweeper.
>>> I removed the warning and the associated code.
>>>
>>>
>>> Also, I think that we can remove either -XX:MethodFlushing or
>>> -XX:UseCodeCacheFlushing. Since 8020151, one of them is redundant
>>> and can be removed. I am not quite sure if we should do that now,
>>> so it is not included in the patch.
>>>
>>> Testing
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8027593 also shows
>>> a performance evaluation.
>>>
>>> Many thanks for looking at the patch.
>>> Best,
>>> Albert