Pls review 7091418: FX priority class from Solaris should be available to JVM )

Tue Jan 24 11:55:08 PST 2012

Thanks for the review.

Inline...

On 1/23/12 11:53 PM, David Holmes wrote:
> Hi Paul,
>
> Warning: long winded response below

Looking forward to it. :)

>
> Summary:
> - file RFE to fully support "system" priority for VM internal threads
> - Fully implement UseCriticalJavaThreadPriority on non-Solaris
>
> You'll have to read on to understand what I mean by both of these :)
>
> On 24/01/2012 8:00 AM, Paul Hohensee wrote:
>> I just found
>>
>> 7082553: Interpret Thread.setPriority(Thread.MAX_PRIORITY) to mean FX60
>> on Solaris 10 and 11
>>
>> which seems like the CR I should actually be using.
>>
>> Ref your last comment on 7082553, we could indeed change the definition
>> of JavaPriority10_To_OSPriority, but that would cover only Java threads,
>> not compiler or the CMS background threads.
>
> Thanks for reminding me about this CR - I knew there had been further 
> discussion on this. That CR is not public, which seems a little 
> pointless now.

Yes.  I've made 7082553 public and make 7091418 a dup of 7082553.

>
> I'll restate my concerns as outlined in that CR. It is very difficult 
> for an application programmer to determine that FX60 will provide any 
> performance benefit, whether assigned to a compiler thread, GC thread 
> or Java thread. The benefits of FX60 only exist when a system has idle 
> cores, so the application programmer and/or the application "deployer" 
> has to know what overall load will exist on the system. For the GC and 
> compiler threads this is not such an issue as the person deploying the 
> app can simply experiment with the flag being on and off and see it it 
> helps in their situation.

Undoubtedly true that only experts should be using it.  There are 
significant
Oracle applications running on T4 that contain serial bottlenecks, so by 
definition
there are idle cores available.

>
> For Java threads however the application programmer has to use 
> Thread.MAX_PRIORITY. This has two negative consequences. First it 
> encourages people to start thinking about Thread priorities when we 
> have spent a dozen years telling them to forget about them because in 
> 99% of the cases they have no meaning. For them to have meaning you 
> have to select non default ThreadPriorityPolicy settings and use the 
> right OS - even then you get different affects on different OSes. 
> Which leads me to the second consequence: a loss of portability. Using 
> Thread.MAX_PRIORITY can have different consequences on different 
> platforms, even today.

That's true.  Imo, the correct solution is to invent a QOS API at a 
higher abstraction
level, e.g., something like Thread.prefer(SYSTEM_THROUGHPUT) or
Thread.prefer(FAST_RESPONSE_TIME).  We wanted to get something useful sooner
than that though, hence this RFE.  Note that it's very much 
experimental, which
is why it's use is gated by -XX:+UseExperimentalVMOptions.  There's no
guarantee at all that we'll continue to support these switches.

>
> Ok - so the response to the above is "These are experimental options 
> which must be unlocked, and its for advanced users and 'caveat 
> emptor'". I can live with that, so lets look at the details of the 
> implementation.

Ah, didn't need to write the above. :)

>
> There are really two distinct parts to this. First the ability to 
> define a special system thread priority. In that regard I like the 
> idea of creating a pseudo-priority of 11 and using that to map to this 
> system, aka critical, priority (though this could be better 
> documented). You then have flags the say "use the system priority for 
> threads of type X". This is cross-platform. There are two things I 
> find problematic with this though:
>
> 1. As priority is simply a number we can't directly use this to 
> implement FX60. Instead we have to put in a hack where a negative 
> value on Solaris tells the system specific code "that means FX 60".

Yes.  I could make things a little better by having a negative number mean
"FX" and to use the positive equivalent directly (i.e., not scale it).  
An alternative
would be to add a second mapping array (or make an array of structs) that
contains the desired scheduling class.  This seemed like overkill to me
given the limited goal of this change.

> 2. There is no mechanism, apart from recompiling the VM, to actually 
> control what this system/critical priority is.

True.

>
> So this general purpose mechanism falls somewhat short of being 
> generally useful as a cross-platform mechanism. I find both of these 
> unsatisfactory and a RFE should be filed to address them.

It's intended to be Solaris-only right now, which is why I didn't 
implement the
full-blown solution on non-Solaris platforms.  Rather I made it 
incidental to
an understandable Solaris implementation, esp. the common code parts
in compileBroker.cpp and concurrentMarkSweepThread.cpp.

I'll file an RFE along the lines of being able to specify a scheduling class
and priority for each Java thread priority.  Either extend 
JavaPriority<n>_To_OSPriority
to take a CSV list argument (which is coming to the command line parser at
some point anyway), or add 10 more switches to specify the scheduling class.

>
> The second part is the "map Thread.MAX_PRIORITY to the critical 
> priority" part. On Solaris you use UseCriticalJavaThreadPriority to 
> actually update the priority mapping 10 -> critical. But you don't do 
> this on the other platforms. This is what I object to - if you are 
> going to make this a cross-platform setting then it should be 
> implemented fully on all platforms. It doesn't matter that by default 
> the end result is the same, the code should be there so that if you 
> could define the critical priority at runtime things would work as 
> expected.

I didn't do it on non-Solaris platforms (at least not in the code) 
because on those
platforms CriticalPriority is the same as MaxPriority.  No runtime 
mapping needed.
For compiler threads and the CMS background thread on non-Solaris platforms,
using critical priority will up their priority, which is new.

I don't know what other equivalent (i.e., scheduling classes) to use on 
non-Solaris
platforms.  Can you recommend something so I can put it in an RFE?  I 
hesitate to
do more than I have because we're at the end of the development cycle 
for 7u4/hs23.

Paul

>
> Cheers,
> David
> -----
>
>>
>> Paul
>>
>> On 1/23/12 2:11 PM, Paul Hohensee wrote:
>>> Thanks for the review.
>>>
>>> Inline...
>>>
>>> On 1/22/12 7:39 PM, David Holmes wrote:
>>>> Hi Paul,
>>>>
>>>> The meta-comment here is that there needs to be a clear description
>>>> of what "critical priority" means and what constraints there are on
>>>> setting it to some OS specific value. For example the current changes
>>>> uses the FX scheduling class, but what if someone used the RT
>>>> scheduling class instead? Would that work? Probably not, in which
>>>> case we should document that this selection of the "critical
>>>> priority" is not an arbitrary choice that can be made.
>>>>
>>>> Even for FX/60 I'm not certain that using this for Java threads might
>>>> not prevent safepoints from being reached or induce some other form
>>>> of livelock.
>>>
>>> I added material to the Comments field of the CR.
>>>
>>> I don't think there's a livelock problem with Java threads, because
>>> Solaris takes
>>> FX60 as advisory, not as a command. All that should happen is that a
>>> critical
>>> priority Java thread will get to the safepoint earlier than
>>> non-critical ones.
>>> I suppose it's possible for critical priority CMS or compiler threads
>>> to starve
>>> non-critical Java threads, but they run at NearMaxPriority by default
>>> now,
>>> which can do the same thing. This is definitely an "expert-only" 
>>> feature
>>> though, which is why it's experimental for the time being.
>>>>
>>>> On 21/01/2012 3:13 AM, Paul Hohensee wrote:
>>>>> Webrev here
>>>>>
>>>>> http://cr.openjdk.java.net/~phh/7091418.00/
>>>>>
>>>>> This change defines a new Java pseudo-priority called
>>>>> CriticalPriority, just above MaxPriority. Compiler threads, the CMS
>>>>> background thread, and Java threads can have the os equivalent of
>>>>> this priority. On Solaris, this is the FX/60 scheduling
>>>>> class/priority. On other platforms, it's the same as MaxPriority's os
>>>>> priority.
>>>>
>>>> For reference this is why the mapping to FX/60 has been proposed:
>>>>
>>>> http://blogs.oracle.com/observatory/entry/critical_threads_optimization 
>>>>
>>>>
>>>> I still don't fully grok what this optimization does in a general
>>>> sense and it seems to be geared to providing better single-threaded
>>>> performance on near-idle systems - which doesn't make any sense to me
>>>> in a JVM context. But FX/60 also gives you true priority over TS/IA
>>>> threads so that may be where the gain comes from. I wonder if any
>>>> experiments were actually done using FX/59 rather than the "magical"
>>>> FX/60?
>>>
>>> It's meant to be Solaris-Sparc-specific, but it was easier to
>>> implement as a
>>> general feature than to specialize it. Given enough cores, FX60 does
>>> indeed
>>> give you true priority over TS/IA threads. If there aren't enough cores
>>> to run both critical threads in single-thread mode and non-critical
>>> threads
>>> at the same time, Solaris will allow non-critical threads to run on the
>>> same core(s) as critical ones.
>>>
>>> I don't know of any FX59 experiments, but given the amount of work
>>> it's taken
>>> for the Solaris folks to get FX60 working, I doubt using it would have
>>> any positive
>>> effect.
>>>>
>>>>> There are 3 new command line switches, all gated by
>>>>> UseExperimentalVMOptions.
>>>>>
>>>>> -XX:+UseCriticalJavaThreadPriority
>>>>>
>>>>> Maps Java MAX_PRIORITY to critical priority.
>>>>
>>>> I found what you have done here to be very confusing. The only place
>>>> UseCriticalJavaThreadPriority is used is on Solaris. There you re-map
>>>> the priority mapping for priority 10 to the "critical priority" as
>>>> described.
>>>
>>> It's actually used on the other OSs. It just maps to MaxPriority on
>>> those.
>>>
>>>>
>>>> On all platforms you added an entry to the priority mapping table(s)
>>>> for a non-existent Java priority 11. This provides a way to lookup
>>>> the "critical priority" for the CMS/Compiler threads - in essence use
>>>> of critical priority for those threads says "pretend these have Java
>>>> priority 11" and then you've added a mapping for a priority 11 that
>>>> is the same as for priority 10 except on Solaris. On Solaris you had
>>>> to use a sentinel value to say "this really means use the "critical
>>>> priority" because there is no way to convey a change of scheduling
>>>> class.
>>>>
>>>> It seems to me that we are pretending to have "critical priority"
>>>> support on all platforms when in reality we don't. If we want to go
>>>> that way then we should extend it to the
>>>> UseCriticalJavaThreadPriority case as well. It should be all or 
>>>> nothing.
>>>
>>> Extend it beyond making CriticalPriority == MaxPriority on non-Solaris
>>> platforms?
>>> I.e., we can now change the compiler and CMS thread priority to
>>> MaxPriority on
>>> non-Solaris platforms. I don't know how to make CriticalPriority
>>> higher than that
>>> on non-Solaris platforms.
>>>
>>>>
>>>> Further it needs to be made clear that these may still be dependent
>>>> on the value of ThreadPriorityPolicy.
>>>
>>> I added a comment to the CR to that effect.
>>>
>>>>
>>>>> -XX:+UseCriticalCompilerThreadPriority
>>>>>
>>>>> All compiler threads run at critical priority.
>>>>
>>>> It should be more clear that UseCriticalCompilerThreadPriority only
>>>> applies if CompilerThreadPriority is not set. Perhaps there should
>>>> also be a startup check for both being used?
>>>
>>> I could, but making CompilerThreadPriority rule is what I intended. 
>>> I'll
>>> add a comment to globals.hpp and the CR.
>>>
>>>>
>>>> Thinking more though we really shouldn't need both flags. The basic
>>>> problem is that the current "api" only supports setting a simple
>>>> number and to use FX/60 also requires a change of scheduling class.
>>>> You could add a hack that CompilerThreadPriority=60 means FX/60. Or,
>>>> as I've suggested in past email we could generalize the format of the
>>>> option to allow both a scheduling class designator and priority to be
>>>> passed - that would be a more general mechanism.
>>>
>>> I didn't want to remove CompilerThreadPriority or change it's effect.
>>> I can file a CR
>>> to do that though. Current uses of CompilerThreadPriority=60 should
>>> work like
>>> they always have.
>>>
>>> I wanted to confine the change as much as possible to Solaris _and_ to
>>> limit it
>>> to just scheduling classes where we know we're not likely to provoke
>>> thread
>>> starvation. I can file a CR to add the ability to specify a scheduling
>>> class for
>>> Java threads. It would probably add 10 switches for scheduling class
>>> corresponding
>>> to the existing 10 Java priority switches. I don't have any ideas on
>>> how to
>>> designate particular threads for particular class/priorities.
>>>
>>>>
>>>> Adding a psuedo-priority 11 is just means to work within the current
>>>> limitations of the priority scheme.
>>>
>>> Correct.
>>>
>>>>
>>>>> -XX:+UseCriticalCMSThreadPriority
>>>>>
>>>>> The CMS background thread runs at critical priority.
>>>>
>>>> This doesn't make a lot of sense when you consider the comments in
>>>>
>>>> src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepThread.cpp 
>>>>
>>>>
>>>>
>>>> which still states:
>>>>
>>>> "Priority should be just less than that of VMThread"
>>>>
>>>> This seems to indicate that we don't really understand what the
>>>> priority relationship between GC threads and the VMThread should be.
>>>
>>> No, we don't. That's why this is experimental.
>>>
>>>>
>>>> Should we be able to run the VMThread at FX/60?
>>>
>>> Perhaps. It only matters for things like serial gc, which isn't used
>>> on big iron.
>>>
>>>>
>>>>> On Solaris, one must in addition use -XX:+UseThreadPriorities to use
>>>>> native
>>>>> priorities at all. Otherwise, Hotspot just accepts whatever Solaris
>>>>> decides.
>>>>
>>>> Is it also dependent on the value of ThreadPriorityPolicy? Should it
>>>> be? Does it make sense to use it with either policy value?
>>>
>>> No, it's not dependent on ThreadPriorityPolicy. Critical priority is
>>> the same
>>> no matter what the default MaxPriority java_to_os_priority is. I think
>>> that's
>>> the right thing to do.
>>>
>>>>
>>>>>
>>>>> Before this change, the Solaris implementation could only change
>>>>> priorities
>>>>> within the process scheduling class. It didn't change scheduling
>>>>> classes on
>>>>> a per-thread basis. I added that capability and used it for the
>>>>> critical
>>>>> thread
>>>>> work. I also fixed a bug where we were using thr_setprio() to save 
>>>>> the
>>>>> original native priority during thread creation and reading it back
>>>>> when
>>>>> the thread started via thr_getprio(). Since thr_setprio() can change
>>>>> the
>>>>> user-supplied priority, this resulted in an unintended (lower) 
>>>>> priority
>>>>> being used.
>>>>
>>>> I don't quite follow this. We used thr_setprio to set the native OS
>>>> priority, and we then read it back using thr_getprio and then used
>>>> that to pass to thr_setprio again (and also set_lwp_priority). If
>>>> thr_setprio can change the user-supplied priority then it can make
>>>> that change on the second call too can't it? Does the fact we now
>>>> have a lwp affect this? I'm curious about the fact we still both use
>>>> thr_setprio and set the LWP priority directly ???
>>>
>>> Possibly someone like Dave Dice can answer that question. We were
>>> already using
>>> both thr_setprio and set_lwp_priority together. Likely that was in
>>> case set_lwp_priority
>>> wasn't available.
>>>
>>> thr_setprio takes a value between 0 and 127 and map that to "some
>>> priority" that
>>> may not be the same as its argument. You can, for example, pass it 127
>>> and
>>> get 60 back from thr_getprio. So if we set it once with 127 and then
>>> set it again
>>> with 60, we can ultimately get back 0. Which is what actually used to
>>> happen.
>>>
>>> Paul
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>>> Thanks,
>>>>>
>>>>> Paul
>>>>>
>>>