RFR(s): 8148425: strerror() function is not thread-safe

Thomas Stüfe thomas.stuefe at gmail.com
Sat Mar 12 08:22:10 UTC 2016


Hi David,

thanks again for your review. Here the new version:

http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.02/webrev/

And the diff to the last version:

http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.01-to-02.txt

I fixed the indention, copyright comments and the typo.

Further comments inline.


On Fri, Mar 11, 2016 at 8:04 AM, David Holmes <david.holmes at oracle.com>
wrote:

> Hi Thomas,
>
> Thanks for persevering with this. Sorry for the delayed response.
>
> On 10/03/2016 2:58 AM, Thomas Stüfe wrote:
>
>> Hi all,
>>
>> please take a look at this next version of the fix.
>>
>>
>> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.01/webrev/
>>
>> I am the first to admit that this is quite ugly :(, but it is a
>> pragmatic solution.
>>
>> Some details:
>>
>> - I added os::strerror() as a drop-in replacement for ::strerror() - it
>> returns the text as before, but does not localize and does not handle
>> unknown error values; so it can be implemented using a simple static
>> string table and is always safe to use.
>>
>
> Ok. But I don't think we need os::lasterror() if we have this. A follow up
> clean-up is okay for that though.
>
>
I would love to keep this functionality, because I like printing out the
short concise errno values, and there are a number of logging changes
coming at least for AIX. It also does not cost much, as the large string
table is already there.

If the second API does not suit you, would it be accepable for you if I
hide this functionality behind a second parameter in strerror, defaulting
to false, like I did in my first version? E.g. os::strerror(int e, bool
short_name=false); ?

But talking about clean-up, could we get rid of os::lasterror() ? I think
that function is a bad idea, and it fortunately is only called in two
places, both just need strerror() and no Win32 error handling anyway.

I also would like to remove os::get_last_error(). It returns errno on *nix,
win32 error on Windows. Both are different things, plus there is also an
errno on Windows The return value of this function is useless without
knowing what the returned error number actually is, so it is not of much
use.

Also it does seem to handle unknown error values.
>
> - I added os::errno_name() for those cases where one would want the
>> literalized errno value (e.g. "EINVAL") instead of the full text.
>>
>
> Ok. For the unknown case can it return the numeric value that it was
> passed?
>
>
I would like to avoid this. This would make the function more complicated.

I'd need to keep this thread-safe, so I either would have to allocate
thread local storage for the return string somehow or, alternatively, keep
a fixed string table a la [][] {"unknown (0)", "unknown(1)"} for the first
n values or so.

Also, ::strerror() behaves not consistently for the EINVAL case on all
platforms anyway - on Linux and Bsd, it returns the numerical value, but on
Solaris, Windows and AIX, it just returns "Unknown". Many callsites I saw
take this already into account and print out the numerical errno value in
addition to the error string.

- I fixed all callsites in the hotspot which used ::strerror to use
>> os::strerror, or, in some few cases (mostly my own coding for AIX),
>> os::errno_name().
>>
>
> Bit subjective but okay - I personally don't care either way. :)
>
> - To avoid including os.hpp in debug.hpp, I moved the implemntation of
>> assert_status to debug.cpp
>>
>
> Okay ... did the include cause a problem?
>

os.hpp is a large beast. I wanted to keep it out of debug.hpp, to not force
this include upon anyone who includes debug.hpp just to get assert(). I
also try to avoid calling system apis in header files.


>
>
> General minor comments:
>
> src/os/posix/vm/os_posix.cpp
>
> Indentation of wrapped call needs fixing
>
>
Fixed


> ---
>
> src/share/vm/runtime/os.hpp
>
> Typo:
>
> +   //  sugested in the POSIX standard.
>
> -> suggested
>
>
Fixed


> ---
>
> src/share/vm/utilities/vmError.cpp
>
> Good catch on the code that potentially reports the wrong error using
> get_last_error! Even the simple print_cr(..., strerror(errno), errno) could
> in fact print non-matching values if strerror itself caused an error
> (unlikely of course :) ).
>
>
Thanks :)


> Please check all copyright years - attachListener_solaris.cpp was updated
> to 2015 instead of 2016 :) Also note correct format for two year copyrights
> - in src/share/vm/logging/logFileOutput.cpp you have:
>
> Copyright (c) 2015,2016 Oracle
>
> but it should be:
>
> Copyright (c) 2015, 2016, Oracle
>
>
Fixed


> Thanks,
> David
>
>
Thanks!

Thomas


>
> Kind Regards, Thomas
>>
>>
>>
>>
>> On Tue, Mar 8, 2016 at 11:01 PM, Coleen Phillimore
>> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>>
>> wrote:
>>
>>
>>     This latest version seems okay (or at least better than the other
>>     workarounds).
>>
>>     thanks,
>>     Coleen
>>
>>
>>     On 3/1/16 12:30 AM, David Holmes wrote:
>>
>>         Hi Thomas,
>>
>>         On 1/03/2016 3:23 AM, Thomas Stüfe wrote:
>>
>>             Hi David,
>>
>>             On Mon, Feb 29, 2016 at 4:44 AM, David Holmes
>>             <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>>             <mailto:david.holmes at oracle.com
>>
>>             <mailto:david.holmes at oracle.com>>> wrote:
>>
>>                  Hi Thomas,
>>
>>                  On 27/02/2016 2:05 AM, Thomas Stüfe wrote:
>>
>>                      Hi,
>>
>>                      please take a look at this proposed fix:
>>
>>                      Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8148425
>>                      Webrev:
>>
>> http://cr.openjdk.java.net/~stuefe/webrevs/8141425-strerror-replacement/webrev.00/webrev/
>>
>>                      This adds a replacement function os::strerror() as
>>             a drop-in for
>>                      strerror(), which has a number of issues.
>>
>>                      strerror() is unsafe to use and may cause (and has
>>             caused)
>>                      crashes in
>>                      multithreaded scenarios. It is also not ideal for
>>             log files
>>                      because of the
>>                      implicit localization of the error messages.
>>
>>                      For details please see the discussion under the bug
>>             report.
>>
>>
>>                  I just came across strerror_l, which is required to be
>>             thread-safe.
>>                  Is that a possible alternative? (I'm not sure how
>>             locale's are
>>                  obtained).
>>
>>
>>             Sorry, I think this API is glibc only. At least I cannot
>>             find this in
>>             our AIX headers, nor on Solaris.
>>
>>
>>         It is a POSIX API:
>>
>>
>> http://pubs.opengroup.org/onlinepubs/9699919799/functions/strerror_l.html
>>
>>         added in 2006 but part of POSIX.1-2008. But as per Dalibor's
>>         link not necessarily available:
>>
>>         "This function is missing on some platforms: glibc 2.3.6, Mac OS
>>         X 10.5, FreeBSD 6.0, NetBSD 5.0, OpenBSD 3.8, Minix 3.1.8, AIX
>>         5.1, HP-UX 11, IRIX 6.5, OSF/1 5.1, Solaris 11 2011-11, Cygwin,
>>         mingw, MSVC 9, Interix 3.5, BeOS. "
>>
>>         Pity.
>>
>>
>>                  Otherwise what you have seems okay - though I do
>>             dislike having to
>>                  duplicate all that details already buried in the system
>>                  headers/library. Not sure we need to the long text at
>>             the VM level -
>>                  which would simplify things a bit.
>>
>>
>>             I agree, I dislike this too. Like everyone else in this
>>             thread. But I
>>             think this is a pragmatic solution.
>>
>>             I am a bit stuck here - should we really get rid of the long
>>             text
>>             feature? There are some callsites of strerror() in the
>>             hotspot where
>>             arguably the long text is better suited:
>>             - in assert_status() (see debug.hpp) - this ends up in the
>>             header of
>>             error files, if this suddenly changes to a literalized
>>             errno, people may
>>             be upset
>>
>>
>>         I added assert_status purely to expand on what the failing
>>         status was :) I'm happy to see EINVAL in that context rather
>>         than "invalid argument" (or whatever) :)
>>
>>             - when failing to write a heap dump file - see
>>             services/heapDumper.cpp.
>>             Which ends as printout on the command line, I think.
>>
>>
>>         Maybe ... if it is an error the user can fix.
>>
>>             The safe option would be to provide both variants (short and
>>             long text).
>>             Or, provide the safe, short variant for all logging calls -
>> when
>>             "EINVAL" is enough, and let users continue to use strerror()
>>             for those
>>             few cases where the long text is needed.
>>
>>             What do you think?
>>
>>
>>         Trying to decide what makes me least unhappy. :) Given you've
>>         already done the work I guess having long and short forms is okay.
>>
>>         Thanks,
>>         David
>>
>>             Thanks, Thomas
>>
>>                  Thanks,
>>                  David
>>
>>
>>                      Please note that I did not yet change any call
>>             sites, although
>>                      all call
>>                      sites in the os namespace should already use the
>>             new function. I
>>                      wanted to
>>                      see whether there would be any general objections.
>>
>>                      Kind Regards, Thomas
>>
>>
>>
>>
>>


More information about the hotspot-runtime-dev mailing list