RFR(s): 8148425: strerror() function is not thread-safe
Coleen Phillimore
coleen.phillimore at oracle.com
Tue Mar 15 16:48:24 UTC 2016
On 3/15/16 12:02 PM, Thomas Stüfe wrote:
> Coleen, David,
>
> thank you both!
>
> I will do a follow up change later when there is more time.
Hi Thomas,
Feature complete for jdk9 is coming soon and we'll stop wanting to
sponsor RFEs, unless they are high priority by some definition. The
date for Hotspot is May 12 to allow for integration and testing.
Also with Jigsaw integrating soon, we're trying to limit the changes
that it needs to merge with. I think the Jigsaw changes were sent to
the OpenJDK last week.
thanks,
Coleen
>
> Kind regards, Thomas
>
>
>
> On Tue, Mar 15, 2016 at 4:02 PM, Coleen Phillimore
> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>>
> wrote:
>
>
> This looks fine to me, as-is.
>
> On 3/15/16 8:31 AM, David Holmes wrote:
>
> Hi Thomas,
>
> On 15/03/2016 8:07 PM, Thomas Stüfe wrote:
>
> Hi David,
>
> sorry, I will have not the time to work on this this week.
> If the patch
> is acceptable to you in this form, I'd say we commit it,
> otherwise I'll
> take it up again some time later.
>
>
> If Coleen is okay with this as-is I'll commit it in the morning.
>
> About the suggested changes:
>
> - Merging both APIs into one, like suggested, is fine for
> me and I would
> do this in a follow-up.
>
>
> Sure.
>
> - Handling unknown errno values: I do not like the idea of
> a fallback to
> ::strerror(). Writing the API comment on that one would be
> weird: "Is
> almost threadsafe", "will probably return a static
> constant string"...
> :) I would suggest either using the solution as it is now
> - only
> honouring POSIX errno values or returning "Unknown", or to
> expand them
> with OS specific lists, either in the OS independent form
> ("#ifdef EBLUB
> handle(EBLUB) #endif") or by expanding os::strerror() with
> platform
> dependend os::strerror_pd().
>
> I am pretty unemotional about either option but would
> prefer avoiding
> ::strerror() altogether.
>
>
> A follow up on this okay. I'm unsure how best to handle it.
> Given how rarely we encounter errors I'm probably
> over-thinking it. ;)
>
>
> Yeah, I think you're over-thinking it. The two APIs with comments
> make sense to me, and something returning Unknown isn't a big deal
> in real life.
>
> Thanks for the change and thanks for sponsoring this, David.
>
> Coleen
>
>
> Thanks,
> David
>
> Kind Regards, Thomas
>
>
>
> On Mon, Mar 14, 2016 at 4:43 AM, David Holmes
> <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>> wrote:
>
> On 12/03/2016 6:22 PM, Thomas Stüfe wrote:
>
> Hi David,
>
> thanks again for your review. Here the new version:
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.02/webrev/
> <http://cr.openjdk.java.net/%7Estuefe/webrevs/8141425-strerror-replacement/webrev.02/webrev/>
>
> And the diff to the last version:
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.01-to-02.txt
> <http://cr.openjdk.java.net/%7Estuefe/webrevs/8141425-strerror-replacement/webrev.01-to-02.txt>
>
> I fixed the indention, copyright comments and the
> typo.
>
>
> All looks good - thanks.
>
> Further comments inline.
>
>
> Ditto :)
>
>
> On Fri, Mar 11, 2016 at 8:04 AM, David Holmes
> <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>>> wrote:
>
> Hi Thomas,
>
> Thanks for persevering with this. Sorry for
> the delayed
> response.
>
> On 10/03/2016 2:58 AM, Thomas Stüfe wrote:
>
> Hi all,
>
> please take a look at this next version
> of the fix.
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.01/webrev/
> <http://cr.openjdk.java.net/%7Estuefe/webrevs/8141425-strerror-replacement/webrev.01/webrev/>
>
> I am the first to admit that this is
> quite ugly :(, but
> it is a
> pragmatic solution.
>
> Some details:
>
> - I added os::strerror() as a drop-in
> replacement for
> ::strerror() - it
> returns the text as before, but does not
> localize and
> does not
> handle
> unknown error values; so it can be
> implemented using a
> simple static
> string table and is always safe to use.
>
>
> Ok. But I don't think we need os::lasterror()
> if we have
> this. A
> follow up clean-up is okay for that though.
>
>
> I would love to keep this functionality, because I
> like printing
> out the
> short concise errno values, and there are a number
> of logging
> changes
> coming at least for AIX. It also does not cost
> much, as the
> large string
> table is already there.
>
> If the second API does not suit you, would it be
> accepable for
> you if I
> hide this functionality behind a second parameter
> in strerror,
> defaulting to false, like I did in my first
> version? E.g.
> os::strerror(int e, bool short_name=false); ?
>
>
> That was what I was thinking of yes. :)
>
> But talking about clean-up, could we get rid of
> os::lasterror() ? I
> think that function is a bad idea, and it
> fortunately is only
> called in
> two places, both just need strerror() and no Win32
> error
> handling anyway.
>
>
> Sounds good. lasterror() is a fragile API as it is too
> easy to
> introduce unexpected system calls between the call to
> lasterror and
> the action you want to check for the error.
>
> I also would like to remove os::get_last_error().
> It returns
> errno on
> *nix, win32 error on Windows. Both are different
> things, plus
> there is
> also an errno on Windows The return value of this
> function is
> useless
> without knowing what the returned error number
> actually is, so
> it is not
> of much use.
>
>
> Simplifying and streamlining all this sounds good.
>
> Also it does seem to handle unknown error values.
>
> - I added os::errno_name() for those
> cases where one
> would want the
> literalized errno value (e.g. "EINVAL")
> instead of the
> full text.
>
>
> Ok. For the unknown case can it return the
> numeric value
> that it was
> passed?
>
>
> I would like to avoid this. This would make the
> function more
> complicated.
>
> I'd need to keep this thread-safe, so I either
> would have to
> allocate
> thread local storage for the return string somehow or,
> alternatively,
> keep a fixed string table a la [][] {"unknown (0)",
> "unknown(1)"} for
> the first n values or so.
>
>
> Okay. I was thinking this would be simpler but of
> course we have to
> convert to a string.
>
> I note that linux defines ~133 errno values, and you
> have entries
> for ~75 - are they all the POSIX ones?
>
> I'm wondering whether we should have a fallback
> strategy for an
> unknown (platform-specific) error by using the
> platform strerror
> even though not thread-safe (the window of exposure is
> vastly reduced) ?
>
> Also, ::strerror() behaves not consistently for
> the EINVAL case
> on all
> platforms anyway - on Linux and Bsd, it returns
> the numerical
> value, but
> on Solaris, Windows and AIX, it just returns
> "Unknown". Many
> callsites I
> saw take this already into account and print out
> the numerical errno
> value in addition to the error string.
>
>
> I doubt those callsites are not taking into account
> unknown values -
> I think you will find they originally only reported
> the numerical
> value and were then augmented to also print the
> strerror message.
>
> Thanks,
> David
>
> - I fixed all callsites in the hotspot
> which used
> ::strerror to use
> os::strerror, or, in some few cases
> (mostly my own
> coding for AIX),
> os::errno_name().
>
>
> Bit subjective but okay - I personally don't
> care either
> way. :)
>
> - To avoid including os.hpp in debug.hpp,
> I moved the
> implemntation of
> assert_status to debug.cpp
>
>
> Okay ... did the include cause a problem?
>
>
> os.hpp is a large beast. I wanted to keep it out
> of debug.hpp,
> to not
> force this include upon anyone who includes
> debug.hpp just to get
> assert(). I also try to avoid calling system apis
> in header files.
>
>
>
> General minor comments:
>
> src/os/posix/vm/os_posix.cpp
>
> Indentation of wrapped call needs fixing
>
>
> Fixed
>
> ---
>
> src/share/vm/runtime/os.hpp
>
> Typo:
>
> + // sugested in the POSIX standard.
>
> -> suggested
>
>
> Fixed
>
> ---
>
> src/share/vm/utilities/vmError.cpp
>
> Good catch on the code that potentially
> reports the wrong error
> using get_last_error! Even the simple
> print_cr(...,
> strerror(errno),
> errno) could in fact print non-matching
> values if strerror
> itself
> caused an error (unlikely of course :) ).
>
>
> Thanks :)
>
>
> Please check all copyright years -
> attachListener_solaris.cpp was
> updated to 2015 instead of 2016 :) Also note
> correct format
> for two
> year copyrights - in
> src/share/vm/logging/logFileOutput.cpp
> you have:
>
> Copyright (c) 2015,2016 Oracle
>
> but it should be:
>
> Copyright (c) 2015, 2016, Oracle
>
> Fixed
>
> Thanks,
> David
>
>
> Thanks!
>
> Thomas
>
>
> Kind Regards, Thomas
>
>
>
>
> On Tue, Mar 8, 2016 at 11:01 PM, Coleen
> Phillimore
> <coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>
> <mailto:coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>>
> <mailto:coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>
> <mailto:coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>>>
> <mailto:coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>
> <mailto:coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>>
>
> <mailto:coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>
> <mailto:coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>>>>> wrote:
>
>
> This latest version seems okay (or
> at least better
> than the
> other
> workarounds).
>
> thanks,
> Coleen
>
>
> On 3/1/16 12:30 AM, David Holmes wrote:
>
> Hi Thomas,
>
> On 1/03/2016 3:23 AM, Thomas
> Stüfe wrote:
>
> Hi David,
>
> On Mon, Feb 29, 2016 at 4:44
> AM, David Holmes
> <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>>>
>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>>
>
>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>>>>> wrote:
>
> Hi Thomas,
>
> On 27/02/2016 2:05 AM,
> Thomas Stüfe
> wrote:
>
> Hi,
>
> please take a look
> at this
> proposed fix:
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8148425
> Webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.00/webrev/
> <http://cr.openjdk.java.net/%7Estuefe/webrevs/8141425-strerror-replacement/webrev.00/webrev/>
>
> This adds a
> replacement function
> os::strerror() as
> a drop-in for
> strerror(), which
> has a number of
> issues.
>
> strerror() is
> unsafe to use and
> may cause
> (and has
> caused)
> crashes in
> multithreaded
> scenarios. It is
> also not
> ideal for
> log files
> because of the
> implicit
> localization of the
> error messages.
>
> For details please
> see the discussion
> under the bug
> report.
>
>
> I just came across
> strerror_l, which is
> required to be
> thread-safe.
> Is that a possible
> alternative? (I'm
> not sure how
> locale's are
> obtained).
>
>
> Sorry, I think this API is
> glibc only. At
> least I
> cannot
> find this in
> our AIX headers, nor on Solaris.
>
>
> It is a POSIX API:
>
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/strerror_l.html
>
> added in 2006 but part of
> POSIX.1-2008. But as per
> Dalibor's
> link not necessarily available:
>
> "This function is missing on
> some platforms: glibc
> 2.3.6, Mac OS
> X 10.5, FreeBSD 6.0, NetBSD 5.0,
> OpenBSD 3.8,
> Minix
> 3.1.8, AIX
> 5.1, HP-UX 11, IRIX 6.5, OSF/1
> 5.1, Solaris 11
> 2011-11,
> Cygwin,
> mingw, MSVC 9, Interix 3.5, BeOS. "
>
> Pity.
>
>
> Otherwise what you have
> seems okay -
> though I do
> dislike having to
> duplicate all that
> details already
> buried in
> the system
> headers/library. Not
> sure we need to
> the long
> text at
> the VM level -
> which would simplify
> things a bit.
>
>
> I agree, I dislike this too.
> Like everyone
> else in this
> thread. But I
> think this is a pragmatic
> solution.
>
> I am a bit stuck here -
> should we really
> get rid of
> the long
> text
> feature? There are some
> callsites of
> strerror() in the
> hotspot where
> arguably the long text is
> better suited:
> - in assert_status() (see
> debug.hpp) -
> this ends up
> in the
> header of
> error files, if this
> suddenly changes to a
> literalized
> errno, people may
> be upset
>
>
> I added assert_status purely to
> expand on what
> the failing
> status was :) I'm happy to see
> EINVAL in that
> context
> rather
> than "invalid argument" (or
> whatever) :)
>
> - when failing to write a
> heap dump file - see
> services/heapDumper.cpp.
> Which ends as printout on
> the command
> line, I think.
>
>
> Maybe ... if it is an error the
> user can fix.
>
> The safe option would be to
> provide both
> variants
> (short and
> long text).
> Or, provide the safe, short
> variant for
> all logging
> calls - when
> "EINVAL" is enough, and let
> users continue
> to use
> strerror()
> for those
> few cases where the long
> text is needed.
>
> What do you think?
>
>
> Trying to decide what makes me
> least unhappy.
> :) Given
> you've
> already done the work I guess
> having long and
> short
> forms is okay.
>
> Thanks,
> David
>
> Thanks, Thomas
>
> Thanks,
> David
>
>
> Please note that I
> did not yet
> change any call
> sites, although
> all call
> sites in the os
> namespace should
> already
> use the
> new function. I
> wanted to
> see whether there
> would be any
> general
> objections.
>
> Kind Regards, Thomas
>
>
>
>
>
>
>
>
More information about the hotspot-runtime-dev
mailing list