RFR(s): 8148425: strerror() function is not thread-safe

David Holmes david.holmes at oracle.com
Wed Mar 16 09:40:03 UTC 2016


Change pushed but unfortunately I hadn't noticed the bug report had been 
targetted to 10 so a 9 "backport" was created. Bah! I think hgupdater 
needs to be a little smarter about that.

David

On 16/03/2016 2:02 AM, Thomas Stüfe wrote:
> Coleen, David,
>
> thank you both!
>
> I will do a follow up change later when there is more time.
>
> Kind regards, Thomas
>
>
>
> On Tue, Mar 15, 2016 at 4:02 PM, Coleen Phillimore
> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>> wrote:
>
>
>     This looks fine to me, as-is.
>
>     On 3/15/16 8:31 AM, David Holmes wrote:
>
>         Hi Thomas,
>
>         On 15/03/2016 8:07 PM, Thomas Stüfe wrote:
>
>             Hi David,
>
>             sorry, I will have not the time to work on this this week.
>             If the patch
>             is acceptable to you in this form, I'd say we commit it,
>             otherwise I'll
>             take it up again some time later.
>
>
>         If Coleen is okay with this as-is I'll commit it in the  morning.
>
>             About the suggested changes:
>
>             - Merging both APIs into one, like suggested, is fine for me
>             and I would
>             do this in a follow-up.
>
>
>         Sure.
>
>             - Handling unknown errno values: I do not like the idea of a
>             fallback to
>             ::strerror(). Writing the API comment on that one would be
>             weird: "Is
>             almost threadsafe", "will probably return a static constant
>             string"...
>             :) I would suggest either using the solution as it is now - only
>             honouring POSIX errno values or returning "Unknown", or to
>             expand them
>             with OS specific lists, either in the OS independent form
>             ("#ifdef EBLUB
>             handle(EBLUB) #endif") or by expanding os::strerror() with
>             platform
>             dependend os::strerror_pd().
>
>             I am pretty unemotional about either option but would prefer
>             avoiding
>             ::strerror() altogether.
>
>
>         A follow up on this okay. I'm unsure how best to handle it.
>         Given how rarely we encounter errors I'm probably over-thinking
>         it. ;)
>
>
>     Yeah, I think you're over-thinking it.  The two APIs with comments
>     make sense to me, and something returning Unknown isn't a big deal
>     in real life.
>
>     Thanks for the change and thanks for sponsoring this, David.
>
>     Coleen
>
>
>         Thanks,
>         David
>
>             Kind Regards, Thomas
>
>
>
>             On Mon, Mar 14, 2016 at 4:43 AM, David Holmes
>             <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>             <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>> wrote:
>
>                  On 12/03/2016 6:22 PM, Thomas Stüfe wrote:
>
>                      Hi David,
>
>                      thanks again for your review. Here the new version:
>
>             http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.02/webrev/
>
>                      And the diff to the last version:
>
>             http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.01-to-02.txt
>
>                      I fixed the indention, copyright comments and the typo.
>
>
>                  All looks good - thanks.
>
>                      Further comments inline.
>
>
>                  Ditto :)
>
>
>                      On Fri, Mar 11, 2016 at 8:04 AM, David Holmes
>                      <david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>
>             <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>>> wrote:
>
>                           Hi Thomas,
>
>                           Thanks for persevering with this. Sorry for
>             the delayed
>                      response.
>
>                           On 10/03/2016 2:58 AM, Thomas Stüfe wrote:
>
>                               Hi all,
>
>                               please take a look at this next version of
>             the fix.
>
>             http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.01/webrev/
>
>                               I am the first to admit that this is quite
>             ugly :(, but
>                      it is a
>                               pragmatic solution.
>
>                               Some details:
>
>                               - I added os::strerror() as a drop-in
>             replacement for
>                               ::strerror() - it
>                               returns the text as before, but does not
>             localize and
>                      does not
>                               handle
>                               unknown error values; so it can be
>             implemented using a
>                      simple static
>                               string table and is always safe to use.
>
>
>                           Ok. But I don't think we need os::lasterror()
>             if we have
>                      this. A
>                           follow up clean-up is okay for that though.
>
>
>                      I would love to keep this functionality, because I
>             like printing
>                      out the
>                      short concise errno values, and there are a number
>             of logging
>                      changes
>                      coming at least for AIX. It also does not cost
>             much, as the
>                      large string
>                      table is already there.
>
>                      If the second API does not suit you, would it be
>             accepable for
>                      you if I
>                      hide this functionality behind a second parameter
>             in strerror,
>                      defaulting to false, like I did in my first
>             version? E.g.
>                      os::strerror(int e, bool short_name=false); ?
>
>
>                  That was what I was thinking of yes. :)
>
>                      But talking about clean-up, could we get rid of
>             os::lasterror() ? I
>                      think that function is a bad idea, and it
>             fortunately is only
>                      called in
>                      two places, both just need strerror() and no Win32
>             error
>                      handling anyway.
>
>
>                  Sounds good. lasterror() is a fragile API as it is too
>             easy to
>                  introduce unexpected system calls between the call to
>             lasterror and
>                  the action you want to check for the error.
>
>                      I also would like to remove os::get_last_error().
>             It returns
>                      errno on
>                      *nix, win32 error on Windows. Both are different
>             things, plus
>                      there is
>                      also an errno on Windows The return value of this
>             function is
>                      useless
>                      without knowing what the returned error number
>             actually is, so
>                      it is not
>                      of much use.
>
>
>                  Simplifying and streamlining all this sounds good.
>
>                           Also it does seem to handle unknown error values.
>
>                               - I added os::errno_name() for those cases
>             where one
>                      would want the
>                               literalized errno value (e.g. "EINVAL")
>             instead of the
>                      full text.
>
>
>                           Ok. For the unknown case can it return the
>             numeric value
>                      that it was
>                           passed?
>
>
>                      I would like to avoid this. This would make the
>             function more
>                      complicated.
>
>                      I'd need to keep this thread-safe, so I either
>             would have to
>                      allocate
>                      thread local storage for the return string somehow or,
>                      alternatively,
>                      keep a fixed string table a la [][] {"unknown (0)",
>                      "unknown(1)"} for
>                      the first n values or so.
>
>
>                  Okay. I was thinking this would be simpler but of
>             course we have to
>                  convert to a string.
>
>                  I note that linux defines ~133 errno values, and you
>             have entries
>                  for ~75 - are they all the POSIX ones?
>
>                  I'm wondering whether we should have a fallback
>             strategy for an
>                  unknown (platform-specific) error by using the platform
>             strerror
>                  even though not thread-safe (the window of exposure is
>             vastly reduced) ?
>
>                      Also, ::strerror() behaves not consistently for the
>             EINVAL case
>                      on all
>                      platforms anyway - on Linux and Bsd, it returns the
>             numerical
>                      value, but
>                      on Solaris, Windows and AIX, it just returns
>             "Unknown". Many
>                      callsites I
>                      saw take this already into account and print out
>             the numerical errno
>                      value in addition to the error string.
>
>
>                  I doubt those callsites are not taking into account
>             unknown values -
>                  I think you will find they originally only reported the
>             numerical
>                  value and were then augmented to also print the
>             strerror message.
>
>                  Thanks,
>                  David
>
>                               - I fixed all callsites in the hotspot
>             which used
>                      ::strerror to use
>                               os::strerror, or, in some few cases
>             (mostly my own
>                      coding for AIX),
>                               os::errno_name().
>
>
>                           Bit subjective but okay - I personally don't
>             care either
>                      way. :)
>
>                               - To avoid including os.hpp in debug.hpp,
>             I moved the
>                               implemntation of
>                               assert_status to debug.cpp
>
>
>                           Okay ... did the include cause a problem?
>
>
>                      os.hpp is a large beast. I wanted to keep it out of
>             debug.hpp,
>                      to not
>                      force this include upon anyone who includes
>             debug.hpp just to get
>                      assert(). I also try to avoid calling system apis
>             in header files.
>
>
>
>                           General minor comments:
>
>                           src/os/posix/vm/os_posix.cpp
>
>                           Indentation of wrapped call needs fixing
>
>
>                      Fixed
>
>                           ---
>
>                           src/share/vm/runtime/os.hpp
>
>                           Typo:
>
>                           +   //  sugested in the POSIX standard.
>
>                           -> suggested
>
>
>                      Fixed
>
>                           ---
>
>                           src/share/vm/utilities/vmError.cpp
>
>                           Good catch on the code that potentially
>             reports the wrong error
>                           using get_last_error! Even the simple
>             print_cr(...,
>                      strerror(errno),
>                           errno) could in fact print non-matching values
>             if strerror
>                      itself
>                           caused an error (unlikely of course :) ).
>
>
>                      Thanks :)
>
>
>                           Please check all copyright years -
>                      attachListener_solaris.cpp was
>                           updated to 2015 instead of 2016 :) Also note
>             correct format
>                      for two
>                           year copyrights - in
>             src/share/vm/logging/logFileOutput.cpp
>                      you have:
>
>                           Copyright (c) 2015,2016 Oracle
>
>                           but it should be:
>
>                           Copyright (c) 2015, 2016, Oracle
>
>                      Fixed
>
>                           Thanks,
>                           David
>
>
>                      Thanks!
>
>                      Thomas
>
>
>                               Kind Regards, Thomas
>
>
>
>
>                               On Tue, Mar 8, 2016 at 11:01 PM, Coleen
>             Phillimore
>                               <coleen.phillimore at oracle.com
>             <mailto:coleen.phillimore at oracle.com>
>                      <mailto:coleen.phillimore at oracle.com
>             <mailto:coleen.phillimore at oracle.com>>
>                               <mailto:coleen.phillimore at oracle.com
>             <mailto:coleen.phillimore at oracle.com>
>                      <mailto:coleen.phillimore at oracle.com
>             <mailto:coleen.phillimore at oracle.com>>>
>                               <mailto:coleen.phillimore at oracle.com
>             <mailto:coleen.phillimore at oracle.com>
>                      <mailto:coleen.phillimore at oracle.com
>             <mailto:coleen.phillimore at oracle.com>>
>
>                               <mailto:coleen.phillimore at oracle.com
>             <mailto:coleen.phillimore at oracle.com>
>                      <mailto:coleen.phillimore at oracle.com
>             <mailto:coleen.phillimore at oracle.com>>>>> wrote:
>
>
>                                    This latest version seems okay (or at
>             least better
>                      than the
>                               other
>                                    workarounds).
>
>                                    thanks,
>                                    Coleen
>
>
>                                    On 3/1/16 12:30 AM, David Holmes wrote:
>
>                                        Hi Thomas,
>
>                                        On 1/03/2016 3:23 AM, Thomas
>             Stüfe wrote:
>
>                                            Hi David,
>
>                                            On Mon, Feb 29, 2016 at 4:44
>             AM, David Holmes
>                                            <david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>
>                               <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>
>             <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>
>                               <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>>>
>
>             <mailto:david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>
>                               <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>>
>
>
>             <mailto:david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>
>                               <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>
>                      <mailto:david.holmes at oracle.com
>             <mailto:david.holmes at oracle.com>>>>>> wrote:
>
>                                                 Hi Thomas,
>
>                                                 On 27/02/2016 2:05 AM,
>             Thomas Stüfe
>                      wrote:
>
>                                                     Hi,
>
>                                                     please take a look
>             at this
>                      proposed fix:
>
>                                                     Bug:
>             https://bugs.openjdk.java.net/browse/JDK-8148425
>                                                     Webrev:
>             http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.00/webrev/
>
>                                                     This adds a
>             replacement function
>                               os::strerror() as
>                                            a drop-in for
>                                                     strerror(), which
>             has a number of
>                      issues.
>
>                                                     strerror() is unsafe
>             to use and
>                      may cause
>                               (and has
>                                            caused)
>                                                     crashes in
>                                                     multithreaded
>             scenarios. It is
>                      also not
>                               ideal for
>                                            log files
>                                                     because of the
>                                                     implicit
>             localization of the
>                      error messages.
>
>                                                     For details please
>             see the discussion
>                               under the bug
>                                            report.
>
>
>                                                 I just came across
>             strerror_l, which is
>                               required to be
>                                            thread-safe.
>                                                 Is that a possible
>             alternative? (I'm
>                      not sure how
>                                            locale's are
>                                                 obtained).
>
>
>                                            Sorry, I think this API is
>             glibc only. At
>                      least I
>                               cannot
>                                            find this in
>                                            our AIX headers, nor on Solaris.
>
>
>                                        It is a POSIX API:
>
>             http://pubs.opengroup.org/onlinepubs/9699919799/functions/strerror_l.html
>
>                                        added in 2006 but part of
>             POSIX.1-2008. But as per
>                               Dalibor's
>                                        link not necessarily available:
>
>                                        "This function is missing on some
>             platforms: glibc
>                               2.3.6, Mac OS
>                                        X 10.5, FreeBSD 6.0, NetBSD 5.0,
>             OpenBSD 3.8,
>                      Minix
>                               3.1.8, AIX
>                                        5.1, HP-UX 11, IRIX 6.5, OSF/1
>             5.1, Solaris 11
>                      2011-11,
>                               Cygwin,
>                                        mingw, MSVC 9, Interix 3.5, BeOS. "
>
>                                        Pity.
>
>
>                                                 Otherwise what you have
>             seems okay -
>                      though I do
>                                            dislike having to
>                                                 duplicate all that
>             details already
>                      buried in
>                               the system
>                                                 headers/library. Not
>             sure we need to
>                      the long
>                               text at
>                                            the VM level -
>                                                 which would simplify
>             things a bit.
>
>
>                                            I agree, I dislike this too.
>             Like everyone
>                      else in this
>                                            thread. But I
>                                            think this is a pragmatic
>             solution.
>
>                                            I am a bit stuck here -
>             should we really
>                      get rid of
>                               the long
>                                            text
>                                            feature? There are some
>             callsites of
>                      strerror() in the
>                                            hotspot where
>                                            arguably the long text is
>             better suited:
>                                            - in assert_status() (see
>             debug.hpp) -
>                      this ends up
>                               in the
>                                            header of
>                                            error files, if this suddenly
>             changes to a
>                      literalized
>                                            errno, people may
>                                            be upset
>
>
>                                        I added assert_status purely to
>             expand on what
>                      the failing
>                                        status was :) I'm happy to see
>             EINVAL in that
>                      context
>                               rather
>                                        than "invalid argument" (or
>             whatever) :)
>
>                                            - when failing to write a
>             heap dump file - see
>                                            services/heapDumper.cpp.
>                                            Which ends as printout on the
>             command
>                      line, I think.
>
>
>                                        Maybe ... if it is an error the
>             user can fix.
>
>                                            The safe option would be to
>             provide both
>                      variants
>                               (short and
>                                            long text).
>                                            Or, provide the safe, short
>             variant for
>                      all logging
>                               calls - when
>                                            "EINVAL" is enough, and let
>             users continue
>                      to use
>                               strerror()
>                                            for those
>                                            few cases where the long text
>             is needed.
>
>                                            What do you think?
>
>
>                                        Trying to decide what makes me
>             least unhappy.
>                      :) Given
>                               you've
>                                        already done the work I guess
>             having long and
>                      short
>                               forms is okay.
>
>                                        Thanks,
>                                        David
>
>                                            Thanks, Thomas
>
>                                                 Thanks,
>                                                 David
>
>
>                                                     Please note that I
>             did not yet
>                      change any call
>                                            sites, although
>                                                     all call
>                                                     sites in the os
>             namespace should
>                      already
>                               use the
>                                            new function. I
>                                                     wanted to
>                                                     see whether there
>             would be any
>                      general
>                               objections.
>
>                                                     Kind Regards, Thomas
>
>
>
>
>
>
>
>


More information about the hotspot-runtime-dev mailing list