RFR: 8087322: Implement a Semaphore utility class

Mon Jun 15 11:32:33 UTC 2015

On 15/06/2015 7:23 PM, Kim Barrett wrote:
> On Jun 15, 2015, at 3:17 AM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Stefan,
>>
>> On 15/06/2015 4:40 PM, Stefan Karlsson wrote:
>>> Hi David,
>>>
>>> On 2015-06-15 01:47, David Holmes wrote:
>>>> Hi Stefan,
>>>>
>>>> Quick response - sorry will try to get back to this later.
>>>>
>>>> I second the comment to define an os_posix implementation and to do it
>>>> now rather than defer it for future work. Solaris supports POSIX
>>>> semaphore API (as well as the SUS/UI semaphore API). This may boil
>>>> down to only two implementations: posix and win32.
>>>
>>> This adds risk to this patch. Are we sure that the Solaris OS code
>>> doesn't rely on some implementation detail of sema_post/sema_wait that
>>> is different from how the POSIX semaphores work? For example, how it
>>> interacts with signal handlers, etc?
>>
>> Solaris supports two different semaphore API's. The POSIX API has sem_wait, sem_post etc and operates on sem_t. The other (SUS or UI or Solaris-only ?) API has sema_* and operates on sema_t - it is a very subtle difference: sem vs sema. Under the covers they are the same thing, but the API's themselves have some differences. You can use the POSIX API on Solaris, just as on Linux and hopefully BSD (and maybe AIX?)
>>
>>>>
>>>> I don't see the point of the "max" value in the API given it can't
>>>> used on all platforms.
>>>
>>> I don't want it either, but it's needed by the windows implementation. I
>>> could get rid of it if we have a sensible max value. Any suggestion on
>>> what value to use?
>>
>> On Windows pass in whatever MAX windows defines.
>>
>>>>
>>>> The posix semaphore functions return -1 and set errno on error, so any
>>>> guarantees (asserts are normally used here)
>>>
>>> I can change to asserts.
>>>
>>>> should do the appropriate errno string conversion.
>>>
>>> I followed the surrounding code. For example:
>>>
>>> 3411       jio_snprintf(msg, sizeof(msg), "Failed to reserve shared memory (errno = %d).", errno);
>>>
>>> but I see that we print a string at other places. I can change it.
>>
>> Thanks.
>>
>>>
>>> BTW, while looking at the return values of sema_wait, I found the following:
>>>
>>> 2433       while ((ret = ::sema_wait(&sig_sem)) == EINTR)
>>> 2434         ;
>>>
>>> I wonder if this code is broken. Since it checks the return value against EINTR and not errno.
>>
>> sema_wait returns zero or an error code. POSIX sem_wait returns 0/-1 and sets errno.
>
> This is exactly the question / confusion that Stefan and I discussed
> earlier in this review thread.  The documentation is unclear.
>
> The Solaris 11.2 documentation for the sema_xxx functions, describes
> the return values as:
>
>      Upon successful completion, 0 is returned; otherwise, a non-zero
>      value indicates an error.
>
> That doesn't specify that a non-zero return value can be interpreted
> as an error code.  It might empirically (or as a matter of lore) be
> that it can, but it doesn't state that.  (It's entirely possible I'm
> not aware of some overarching documentation somewhere that makes such
> a statement; I'm not that familiar with Solaris.)

But is immediately followed by:

ERRORS
      These functions will fail if:

      EINVAL     The sp argument does not refer to a  valid  sema-
                 phore.

but I agree it is less than clear these are the returned values

> The documentation further contains examples, and those examples show
> the error code associated with a failure (a non-zero return value)
> being obtained from errno, not from the non-zero return value.  This
> also argues against the return value being an error code; why report
> it in multiple places?

I think you will find that this is a simple copy'n'paste error between 
the sem_ and sema_ manpages.

> Of course, one of the benefits of using the POSIX API would be that it
> eliminates such questions, as that API’s documentation is quite clear
> on this matter.
>
>>>> Defining operations as Unimplemented on Windows is bad form. I see no
>>>> reason these should be unimplemented as WaitForSingleObject handles
>>>> the required semantics.
>>>
>>> But that means adding dead/untested code. That's another kind of bad form.
>>
>> Can you elaborate on why only part of the API is being used on Windows? I would have hoped the client code was cross-platform.
>
> Only part of the API is being used on *any* platform at present.  The
> parts of the API that haven’t been implemented for Windows currently
> have no uses in our code base.  The simplest solution would be to
> eliminate (for all ports) the unused part of the API, and add it back
> if someone actually needs it in the future.

Okay in that case it's either all in or none in. But certainly there 
should not be a partially implemented API on some platforms.

>>>> I'm not a fan of the creeping use of "unit tests" inside the main
>>>> sources - have I missed some memo on that?
>>>
>>> We have been doing that for many years now.
>>
>> "we" have? I've only noticed a few occurrences :)
>
> I’m not a fan of the approach we have, but at present it is what we have.
> I’d rather have them than not; they get exercised as part of our normal
> testing processes, and that’s a good thing.

The "internaltests" ?

Not a fan, but more on that in response to Erik. :)

Cheers,
David