RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd

Thu Jan 3 11:13:55 UTC 2019

Hello David and Dan ,  here is a second webrev :

http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/

- adjusted copyright years + fixed some typos
- added  the missing return for FreeBSD  (pointed out by  Dan)
- removed  the  warning message  completely

Best regards, Matthias

> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Donnerstag, 3. Januar 2019 08:53
> To: daniel.daugherty at oracle.com; Baesken, Matthias
> <matthias.baesken at sap.com>; 'hotspot-dev at openjdk.java.net' <hotspot-
> dev at openjdk.java.net>
> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root
> users on linux/bsd
> 
> On 3/01/2019 9:36 am, Daniel D. Daugherty wrote:
> > On 1/2/19 5:35 PM, David Holmes wrote:
> >> Hi Dan,
> >>
> >> On 3/01/2019 2:50 am, Daniel D. Daugherty wrote:
> >>> On 1/2/19 4:11 AM, Baesken, Matthias wrote:
> >>>> Hello , please  review the following patch .
> >>>>
> >>>> Currently, when ThreadPriorityPolicy is set to 1 (so called
> >>>> "Aggressive mode"), on linux and bsd(+Mac) a root-user-check
> >>>> (geteuid() != 0)) is done.
> >>>> See for example the coding in jdk/src/hotspot/os/linux/os_linux.cpp
> >>>> int prio_init().
> >>>>
> >>>> However the root-user-check has a few drawbacks:
> >>>> - it blocks the capabilities feature available on current Linux
> >>>> distros (CAP_SYS_NICE capability) that can be used to allow setting
> >>>> lower niceness also for non-root
> >>>> - setting a higher "niceness" (lower priority) is not possible on
> >>>> Linux for non-root because of the geteuid check
> >>>>
> >>>> We  had a discussion about this in "ThreadPriorityPolicy settings
> >>>> for non-root users"  ,  with this suggestion :
> >>>>
> >>>> https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-
> December/035986.html
> >>>>
> >>>>
> >>>> ....
> >>>>
> >>>>> Just drop the root check for ThreadPriorityPolicy=1 and let the
> >>>>> underlying system
> >>>>> permissions control success or failure.
> >>>> I  did  the change  in this  webrev :
> >>>>
> >>>>
> >>>> Bug/webrev :
> >>>>
> >>>> https://bugs.openjdk.java.net/browse/JDK-8215962
> >>>>
> >>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.0/
> >>>
> >>> General
> >>>    - Please update Copyright years to 2019 before pushing.
> >>>
> >>> src/hotspot/share/runtime/globals.hpp
> >>>      L2045:           "    Linux/BSD/Mac this policy requires root
> >>> privilege or        "\
> >>>          Typo: "Mac" should be "macOS".
> >>>
> >>>      L2046:           "    extended
> >>> capabilites.")                                      \
> >>>          Typo: "capabilites" -> "capabilities."
> >>>
> >>>          Please consider: "an extended capability." since it is normal
> >>>          for only a single capability to be required for a specific
> >>>          policy override.
> >>>
> >>> src/hotspot/os/bsd/os_bsd.cpp
> >>>      L2260: // CAP_SYS_NICE capabilities
> >>>          Typo: "capabilities" -> "capability." (note the added period)
> >>>
> >>>      L2306:   if (ThreadPriorityPolicy == 1) {
> >>>      L2307:     // root and threads with capability CAP_SYS_NICE can
> >>> raise the thread priority
> >>>      L2308:     // however testing the CAP_SYS_NICE capability would
> >>> require libcap.so
> >>>      L2309:     if (geteuid() != 0) {
> >>>      L2310:       if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) {
> >>>      L2311:         warning("-XX:ThreadPriorityPolicy requires root
> >>> privilege or CAP_SYS_NICE capability on Bsd");
> >>>      L2312:       }
> >>>      L2313:     }
> >>>      L2314:   }
> >>>          Sorry this whole block is the wrong thing to do. It makes the
> >>>          assumption that it "knows" the underlying security policy and
> >>>          tries to provide a "helpful" message in anticipation that the
> >>>          underlying security policy will reject the operation.
> >>>
> >>>          The "(ThreadPriorityPolicy == 1) {" if-block needs to be
> >>> deleted.
> >>>
> >>>      If you want to output a helpful warning, then you need to do it in
> >>>      the code that will actually get a policy failure:
> >>>
> >>>      static void do_set_native_prio_warning() {
> >>>          static bool has_warned = false;
> >>>          if (!has_warned) {
> >>>            warning("-XX:ThreadPriorityPolicy requires root privilege
> >>> or the CAP_SYS_NICE capability.");
> >>>            has_warned = true;
> >>>          }
> >>>      }
> >>
> >> Sorry I disagree. The existing code checks for policy and whether root
> >> and issues a warning then resets policy.
> >
> > You have to be careful with the word 'policy' here. You are actually
> > talking about the 'ThreadPriorityPolicy' option here and not the
> > security policy associated with the setpriority() call. One thing
> > the application code cannot do here is reset the underlying
> > security policy.
> 
> I'm only talking about ThreadPriorityPolicy.
> 
> >> The new code does exactly the same thing except it doesn't reset the
> >> policy. The placement of the warning was fine before so it is fine now.
> >
> > Actually I disagree that the placement of the warning was fine before.
> 
> Fine - but it seemed a little unfair to force a re-working of the
> overall approach taken by this code for many years just to effectively
> delete one line that resets the ThreadPriorityPolicy value.
> 
> But see below ...
> 
> 
> > As I said, in my original review comment:
> >
> >>          Sorry this whole block is the wrong thing to do. It makes the
> >>          assumption that it "knows" the underlying security policy and
> >>          tries to provide a "helpful" message in anticipation that the
> >>          underlying security policy will reject the operation.
> >
> > The application code should not make the assumption that it knows the
> > underlying security policy of the setpriority() call. That is a basic
> > principle of Trusted Systems design. The best you can do is try the
> > operation and if it fails, then try to issue a possibly helpful message
> > based on the error that is returned.
> 
> Sure and if this was being designed now rather than just being tweaked
> then a lot of things would be different.
> 
> > There are some Trusted Systems
> > folks that don't believe in trying to interpret errno values either.
> > Why? Because you can only code the errno values that you know today
> that
> > are security policy related. You can't know if some system down the road
> > will add a new errno value that's security related...
> 
> Not to get side-tracked but that's why you use standards that don't keep
> adding new error values - there's a fundamental incompatibility if you
> add new values that effectively overlap with the meaning of existing ones.
> 
> > My proposed do_set_native_prio_warning() function should actually take
> > an errno parameter and it should only issue the warning if the errno is
> > EACCES or EPERM.
> 
> Agreed.
> 
> > Another problem with the new code is that:
> >
> >    warning("-XX:ThreadPriorityPolicy requires root privilege or the
> > CAP_SYS_NICE capability.");
> >
> > will be issued when the user != root and the thread has the
> > CAP_SYS_NICE capability so we'll be issuing a warning even
> > though the setpriority() call should succeed. I don't think
> > a false warning is acceptable.
> 
> It's not nice but I don't think it warrants being completely unacceptable.
> 
> But I'll also note that this is even more complex than outlined because
> it's not just affected by being root or not, nor by CAP_SYS_NICE but
> also by the setting of RLIMIT_RTPRIO (on Linux).
> 
> > By moving the warning to where setpriority() has failed, we no longer
> > would have the problem of a false warning.
> >
> >> Yes it could go on use (and add a only-warn-once hack) but why force
> >> such a disruptive change that has no benefit?
> >
> > As I've pointed out above, I do think it has benefit and it meets
> > a Trusted System design principle. As for the mechanism to
> > only-warn-once, I'm sorry you consider it a hack. I consider it
> > to be a useful way to avoid swamping the warning output with the
> > same message. I know HotSpot does the same thing in other places
> > so if it is a hack, then it is in good company. :-)
> 
> Yes but it's typically done when there isn't a single place where we
> otherwise issue the warning. Ideally you detect capabilities upfront (ie
> when seeing ThreadPriorityPolicy has been set) and issue a warning then
> if warranted. But as noted we don't have a way to detect the appropriate
> capabilities without adding even more code.
> 
> That said a single warning may not be appropriate here anyway as whether
> or not the change to priority fails is not just a function of the
> "permissions" but also whether the priority is being raised or lowered.
> Maybe this shouldn't be a warning at all, but materialize as a
> Java-level exception?
> 
> See more below ...
> 
> >>>      L2321: OSReturn os::set_native_priority(Thread* thread, int
> >>> newpri) {
> >>>      L2322:   if (!UseThreadPriorities || ThreadPriorityPolicy == 0)
> >>> return OS_OK;
> >>>      L2323:
> >>>      L2324: #ifdef __OpenBSD__
> >>>      L2325:   // OpenBSD pthread_setprio starves low priority threads
> >>>      L2326:   return OS_OK;
> >>>      L2327: #elif defined(__FreeBSD__)
> >>>      L2328:   int ret =
> >>> pthread_setprio(thread->osthread()->pthread_id(), newpri);
> >>> // Note the __FreeBSD__ branch here is broken; it is missing the
> >>> return sequence.
> 
> Hmmm - okay this code is even more broken than I thought. I did warn
> Matthias that there is a good change these code paths may never have
> been used - seems this one hasn't even been built with a decent compiler
> (that would complain about the missing return).
> 
> This does make me question how the calling code for this will actually
> handle the OS_ERR return? Will we see a nice Java exception thrown from
> Thread.start? Or will some other part of the VM code abort when seeing
> the error? (That would render the warning somewhat moot.)
> 
> More below ...
> 
> >>>               if (ret != 0) {
> >>>                 do_set_native_prio_warning();
> >>>                 return OS_ERR;
> >>>               }
> >>>               return OS_OK;
> >>>      L2329: #elif defined(__APPLE__) || defined(__NetBSD__)
> >>>      L2330:   struct sched_param sp;
> >>>      L2331:   int policy;
> >>>      L2332:
> >>>      L2333:   if
> >>> (pthread_getschedparam(thread->osthread()->pthread_id(), &policy,
> >>> &sp) != 0) {
> >>>                 do_set_native_prio_warning();
> >>>      L2334:     return OS_ERR;
> >>>      L2335:   }
> >>>      L2336:
> >>>      L2337:   sp.sched_priority = newpri;
> >>>      L2338:   if
> >>> (pthread_setschedparam(thread->osthread()->pthread_id(), policy,
> &sp)
> >>> != 0) {
> >>>                 do_set_native_prio_warning();
> >>>      L2339:     return OS_ERR;
> >>>      L2340:   }
> >>>      L2341:
> >>>      L2342:   return OS_OK;
> >>>      L2343: #else
> >>>      L2344:   int ret = setpriority(PRIO_PROCESS,
> >>> thread->osthread()->thread_id(), newpri);
> >>>               if (ret != 0) {
> >>>                 do_set_native_prio_warning();
> >>>                 return OS_ERR;
> >>>               }
> >>>               return OS_OK;  // replace L2345 with this line.
> >>>      L2345:   return (ret == 0) ? OS_OK : OS_ERR;
> >>>      L2346: #endif
> >>>      L2347: }
> >>>
> >>>
> >>> src/hotspot/os/linux/os_linux.cpp
> >>>      L4080: // CAP_SYS_NICE capabilities
> >>>          Typo: "capabilities" -> "capability." (note the added period)
> >>>
> >>>      The same comment about prio_init() applies here:
> >>>
> >>>      L4103-L4111:
> >>>          The "(ThreadPriorityPolicy == 1) {" if-block needs to be
> >>> deleted.
> >>>
> >>>      static void do_set_native_prio_warning() {
> >>>          static bool has_warned = false;
> >>>          if (!has_warned) {
> >>>            warning("-XX:ThreadPriorityPolicy requires root privilege
> >>> or the CAP_SYS_NICE capability.");
> 
> The warning text doesn't cover all the possibilities and I don't think
> it should. Something more generic like:
> 
> "-XX:ThreadPriorityPolicy=1 is affected by underlying system permissions
> and may trigger errors if priority is changed in ways that are not allowed"
> 
> or as I said maybe this shouldn't be a warning at all ...
> 
> Cheers,
> David
> ------
> 
> >>>            has_warned = true;
> >>>          }
> >>>      }
> >>>
> >>>      But the changes to os::set_native_priority() are much simpler:
> >>>
> >>>      L4118: OSReturn os::set_native_priority(Thread* thread, int
> >>> newpri) {
> >>>      L4119:   if (!UseThreadPriorities || ThreadPriorityPolicy == 0)
> >>> return OS_OK;
> >>>      L4120:
> >>>      L4121:   int ret = setpriority(PRIO_PROCESS,
> >>> thread->osthread()->thread_id(), newpri);
> >>>               if (ret != 0) {
> >>>                 do_set_native_prio_warning();
> >>>                 return OS_ERR;
> >>>               }
> >>>               return OS_OK;  // replace L4122 with this line.
> >>>      L4122:   return (ret == 0) ? OS_OK : OS_ERR;
> >>>      L4123: }
> >>>
> >>> In both os/bsd/os_bsd.cpp and os/linux/os_linux.cpp, the
> >>> os::get_native_priority() code allows for the possibility
> >>> of getting an error condition for getpriority(). I don't
> >>> think we need a do_get_native_prio_warning() function here
> >>> since the only threads we should be querying belong to the
> >>> Java process so they should not fail the policy check.
> >>>
> >>> Dan
> >>>
> >>>
> >>>>
> >>>>
> >>>> Best Regards ,  Matthias
> >>>
> >