RFR(s): 8148425: strerror() function is not thread-safe
David Holmes
david.holmes at oracle.com
Mon Mar 14 03:43:04 UTC 2016
On 12/03/2016 6:22 PM, Thomas Stüfe wrote:
> Hi David,
>
> thanks again for your review. Here the new version:
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.02/webrev/
>
> And the diff to the last version:
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.01-to-02.txt
>
> I fixed the indention, copyright comments and the typo.
All looks good - thanks.
> Further comments inline.
Ditto :)
>
> On Fri, Mar 11, 2016 at 8:04 AM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
> Hi Thomas,
>
> Thanks for persevering with this. Sorry for the delayed response.
>
> On 10/03/2016 2:58 AM, Thomas Stüfe wrote:
>
> Hi all,
>
> please take a look at this next version of the fix.
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.01/webrev/
>
> I am the first to admit that this is quite ugly :(, but it is a
> pragmatic solution.
>
> Some details:
>
> - I added os::strerror() as a drop-in replacement for
> ::strerror() - it
> returns the text as before, but does not localize and does not
> handle
> unknown error values; so it can be implemented using a simple static
> string table and is always safe to use.
>
>
> Ok. But I don't think we need os::lasterror() if we have this. A
> follow up clean-up is okay for that though.
>
>
> I would love to keep this functionality, because I like printing out the
> short concise errno values, and there are a number of logging changes
> coming at least for AIX. It also does not cost much, as the large string
> table is already there.
>
> If the second API does not suit you, would it be accepable for you if I
> hide this functionality behind a second parameter in strerror,
> defaulting to false, like I did in my first version? E.g.
> os::strerror(int e, bool short_name=false); ?
That was what I was thinking of yes. :)
> But talking about clean-up, could we get rid of os::lasterror() ? I
> think that function is a bad idea, and it fortunately is only called in
> two places, both just need strerror() and no Win32 error handling anyway.
Sounds good. lasterror() is a fragile API as it is too easy to introduce
unexpected system calls between the call to lasterror and the action you
want to check for the error.
> I also would like to remove os::get_last_error(). It returns errno on
> *nix, win32 error on Windows. Both are different things, plus there is
> also an errno on Windows The return value of this function is useless
> without knowing what the returned error number actually is, so it is not
> of much use.
Simplifying and streamlining all this sounds good.
> Also it does seem to handle unknown error values.
>
> - I added os::errno_name() for those cases where one would want the
> literalized errno value (e.g. "EINVAL") instead of the full text.
>
>
> Ok. For the unknown case can it return the numeric value that it was
> passed?
>
>
> I would like to avoid this. This would make the function more complicated.
>
> I'd need to keep this thread-safe, so I either would have to allocate
> thread local storage for the return string somehow or, alternatively,
> keep a fixed string table a la [][] {"unknown (0)", "unknown(1)"} for
> the first n values or so.
Okay. I was thinking this would be simpler but of course we have to
convert to a string.
I note that linux defines ~133 errno values, and you have entries for
~75 - are they all the POSIX ones?
I'm wondering whether we should have a fallback strategy for an unknown
(platform-specific) error by using the platform strerror even though not
thread-safe (the window of exposure is vastly reduced) ?
> Also, ::strerror() behaves not consistently for the EINVAL case on all
> platforms anyway - on Linux and Bsd, it returns the numerical value, but
> on Solaris, Windows and AIX, it just returns "Unknown". Many callsites I
> saw take this already into account and print out the numerical errno
> value in addition to the error string.
I doubt those callsites are not taking into account unknown values - I
think you will find they originally only reported the numerical value
and were then augmented to also print the strerror message.
Thanks,
David
> - I fixed all callsites in the hotspot which used ::strerror to use
> os::strerror, or, in some few cases (mostly my own coding for AIX),
> os::errno_name().
>
>
> Bit subjective but okay - I personally don't care either way. :)
>
> - To avoid including os.hpp in debug.hpp, I moved the
> implemntation of
> assert_status to debug.cpp
>
>
> Okay ... did the include cause a problem?
>
>
> os.hpp is a large beast. I wanted to keep it out of debug.hpp, to not
> force this include upon anyone who includes debug.hpp just to get
> assert(). I also try to avoid calling system apis in header files.
>
>
>
> General minor comments:
>
> src/os/posix/vm/os_posix.cpp
>
> Indentation of wrapped call needs fixing
>
>
> Fixed
>
> ---
>
> src/share/vm/runtime/os.hpp
>
> Typo:
>
> + // sugested in the POSIX standard.
>
> -> suggested
>
>
> Fixed
>
> ---
>
> src/share/vm/utilities/vmError.cpp
>
> Good catch on the code that potentially reports the wrong error
> using get_last_error! Even the simple print_cr(..., strerror(errno),
> errno) could in fact print non-matching values if strerror itself
> caused an error (unlikely of course :) ).
>
>
> Thanks :)
>
>
> Please check all copyright years - attachListener_solaris.cpp was
> updated to 2015 instead of 2016 :) Also note correct format for two
> year copyrights - in src/share/vm/logging/logFileOutput.cpp you have:
>
> Copyright (c) 2015,2016 Oracle
>
> but it should be:
>
> Copyright (c) 2015, 2016, Oracle
>
> Fixed
>
> Thanks,
> David
>
>
> Thanks!
>
> Thomas
>
>
> Kind Regards, Thomas
>
>
>
>
> On Tue, Mar 8, 2016 at 11:01 PM, Coleen Phillimore
> <coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>
> <mailto:coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>>> wrote:
>
>
> This latest version seems okay (or at least better than the
> other
> workarounds).
>
> thanks,
> Coleen
>
>
> On 3/1/16 12:30 AM, David Holmes wrote:
>
> Hi Thomas,
>
> On 1/03/2016 3:23 AM, Thomas Stüfe wrote:
>
> Hi David,
>
> On Mon, Feb 29, 2016 at 4:44 AM, David Holmes
> <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>>> wrote:
>
> Hi Thomas,
>
> On 27/02/2016 2:05 AM, Thomas Stüfe wrote:
>
> Hi,
>
> please take a look at this proposed fix:
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8148425
> Webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.00/webrev/
>
> This adds a replacement function
> os::strerror() as
> a drop-in for
> strerror(), which has a number of issues.
>
> strerror() is unsafe to use and may cause
> (and has
> caused)
> crashes in
> multithreaded scenarios. It is also not
> ideal for
> log files
> because of the
> implicit localization of the error messages.
>
> For details please see the discussion
> under the bug
> report.
>
>
> I just came across strerror_l, which is
> required to be
> thread-safe.
> Is that a possible alternative? (I'm not sure how
> locale's are
> obtained).
>
>
> Sorry, I think this API is glibc only. At least I
> cannot
> find this in
> our AIX headers, nor on Solaris.
>
>
> It is a POSIX API:
>
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/strerror_l.html
>
> added in 2006 but part of POSIX.1-2008. But as per
> Dalibor's
> link not necessarily available:
>
> "This function is missing on some platforms: glibc
> 2.3.6, Mac OS
> X 10.5, FreeBSD 6.0, NetBSD 5.0, OpenBSD 3.8, Minix
> 3.1.8, AIX
> 5.1, HP-UX 11, IRIX 6.5, OSF/1 5.1, Solaris 11 2011-11,
> Cygwin,
> mingw, MSVC 9, Interix 3.5, BeOS. "
>
> Pity.
>
>
> Otherwise what you have seems okay - though I do
> dislike having to
> duplicate all that details already buried in
> the system
> headers/library. Not sure we need to the long
> text at
> the VM level -
> which would simplify things a bit.
>
>
> I agree, I dislike this too. Like everyone else in this
> thread. But I
> think this is a pragmatic solution.
>
> I am a bit stuck here - should we really get rid of
> the long
> text
> feature? There are some callsites of strerror() in the
> hotspot where
> arguably the long text is better suited:
> - in assert_status() (see debug.hpp) - this ends up
> in the
> header of
> error files, if this suddenly changes to a literalized
> errno, people may
> be upset
>
>
> I added assert_status purely to expand on what the failing
> status was :) I'm happy to see EINVAL in that context
> rather
> than "invalid argument" (or whatever) :)
>
> - when failing to write a heap dump file - see
> services/heapDumper.cpp.
> Which ends as printout on the command line, I think.
>
>
> Maybe ... if it is an error the user can fix.
>
> The safe option would be to provide both variants
> (short and
> long text).
> Or, provide the safe, short variant for all logging
> calls - when
> "EINVAL" is enough, and let users continue to use
> strerror()
> for those
> few cases where the long text is needed.
>
> What do you think?
>
>
> Trying to decide what makes me least unhappy. :) Given
> you've
> already done the work I guess having long and short
> forms is okay.
>
> Thanks,
> David
>
> Thanks, Thomas
>
> Thanks,
> David
>
>
> Please note that I did not yet change any call
> sites, although
> all call
> sites in the os namespace should already
> use the
> new function. I
> wanted to
> see whether there would be any general
> objections.
>
> Kind Regards, Thomas
>
>
>
>
>
More information about the hotspot-runtime-dev
mailing list