RFR: 8087322: Implement a Semaphore utility class

Mon Jun 15 09:23:13 UTC 2015

On Jun 15, 2015, at 3:17 AM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Hi Stefan,
> 
> On 15/06/2015 4:40 PM, Stefan Karlsson wrote:
>> Hi David,
>> 
>> On 2015-06-15 01:47, David Holmes wrote:
>>> Hi Stefan,
>>> 
>>> Quick response - sorry will try to get back to this later.
>>> 
>>> I second the comment to define an os_posix implementation and to do it
>>> now rather than defer it for future work. Solaris supports the POSIX
>>> semaphore API (as well as the SUS/UI semaphore API). This may boil
>>> down to only two implementations: posix and win32.
>> 
>> This adds risk to this patch. Are we sure that the Solaris OS code
>> doesn't rely on some implementation detail of sema_post/sema_wait that
>> is different from how the POSIX semaphores work? For example, how it
>> interacts with signal handlers, etc?
> 
> Solaris supports two different semaphore APIs. The POSIX API has sem_wait, sem_post etc and operates on sem_t. The other (SUS or UI or Solaris-only?) API has sema_* and operates on sema_t - it is a very subtle difference: sem vs sema. Under the covers they are the same thing, but the APIs themselves have some differences. You can use the POSIX API on Solaris, just as on Linux and hopefully BSD (and maybe AIX?)
> 
>>> 
>>> I don't see the point of the "max" value in the API given it can't
>>> be used on all platforms.
>> 
>> I don't want it either, but it's needed by the windows implementation. I
>> could get rid of it if we have a sensible max value. Any suggestion on
>> what value to use?
> 
> On Windows pass in whatever MAX Windows defines.
> 
>>> 
>>> The posix semaphore functions return -1 and set errno on error, so any
>>> guarantees (asserts are normally used here)
>> 
>> I can change to asserts.
>> 
>>> should do the appropriate errno string conversion.
>> 
>> I followed the surrounding code. For example:
>> 
>>     jio_snprintf(msg, sizeof(msg), "Failed to reserve shared memory (errno = %d).", errno);
>> 
>> but I see that we print a string at other places. I can change it.
> 
> Thanks.
> 
>> 
>> BTW, while looking at the return values of sema_wait, I found the following:
>> 
>>     while ((ret = ::sema_wait(&sig_sem)) == EINTR)
>>       ;
>> 
>> I wonder if this code is broken, since it checks the return value against EINTR and not errno.
> 
> sema_wait returns zero or an error code. POSIX sem_wait returns 0/-1 and sets errno.

This is exactly the question / confusion that Stefan and I discussed
earlier in this review thread.  The documentation is unclear.

The Solaris 11.2 documentation for the sema_xxx functions describes
the return values as:

    Upon successful completion, 0 is returned; otherwise, a non-zero
    value indicates an error.

That doesn't specify that a non-zero return value can be interpreted
as an error code.  It might empirically (or as a matter of lore) be
that it can, but it doesn't state that.  (It's entirely possible I'm
not aware of some overarching documentation somewhere that makes such
a statement; I'm not that familiar with Solaris.)

The documentation further contains examples, and those examples show
the error code associated with a failure (a non-zero return value)
being obtained from errno, not from the non-zero return value.  This
also argues against the return value being an error code; why report
it in multiple places?

Of course, one of the benefits of using the POSIX API would be that it
eliminates such questions, as that API’s documentation is quite clear
on this matter.
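
For illustration, here is a minimal sketch of what the POSIX style
looks like in practice.  It assumes a platform with unnamed POSIX
semaphores (sem_init), and the helper names wait_retrying_on_eintr
and report_sem_error are hypothetical, not part of the patch under
review.  sem_wait returns 0 or -1, and the reason for a failure is
only ever taken from errno:

```cpp
#include <semaphore.h>
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical helper: wait on a POSIX semaphore, retrying when a signal
// handler interrupts the call. Returns 0 on success, otherwise the errno
// value that sem_wait reported.
static int wait_retrying_on_eintr(sem_t* sem) {
  assert(sem != NULL);
  while (::sem_wait(sem) == -1) {
    // POSIX: the return value is only 0 or -1; the cause is in errno.
    if (errno != EINTR) {
      return errno;
    }
  }
  return 0;
}

// Hypothetical helper: report a failure with both the errno string and number.
static void report_sem_error(const char* what, int err) {
  std::fprintf(stderr, "%s failed: %s (errno = %d)\n",
               what, std::strerror(err), err);
}
```

A caller would check the result of wait_retrying_on_eintr and pass a
non-zero value to report_sem_error.  Note that the retry loop compares
errno against EINTR, whereas the existing sema_wait loop quoted above
compares the return value itself.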

>>> Defining operations as Unimplemented on Windows is bad form. I see no
>>> reason these should be unimplemented as WaitForSingleObject handles
>>> the required semantics.
>> 
>> But that means adding dead/untested code. That's another kind of bad form.
> 
> Can you elaborate on why only part of the API is being used on Windows? I would have hoped the client code was cross-platform.

Only part of the API is being used on *any* platform at present.  The
parts of the API that haven’t been implemented for Windows currently
have no uses in our code base.  The simplest solution would be to
eliminate (for all ports) the unused part of the API, and add it back
if someone actually needs it in the future.

>>> I'm not a fan of the creeping use of "unit tests" inside the main
>>> sources - have I missed some memo on that?
>> 
>> We have been doing that for many years now.
> 
> "we" have? I've only noticed a few occurrences :)

I’m not a fan of the approach we have, but at present it is what we have.
I’d rather have them than not; they get exercised as part of our normal
testing processes, and that’s a good thing.