RFR: 8087322: Implement a Semaphore utility class

Fri Jun 12 18:45:08 UTC 2015

Hi Kim,

On 2015-06-12 20:08, Kim Barrett wrote:
> On Jun 12, 2015, at 11:21 AM, Stefan Karlsson <stefan.karlsson at oracle.com> wrote:
>> Hi all,
>>
>> Please review this patch to create a Semaphore utility class. I need this class to implementing faster GC thread synchronization [1].
>>
>> http://cr.openjdk.java.net/~stefank/8087322/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8087322
>>
>> Some of our platforms already implement a Semaphore class, but those classes are hidden inside os_<os>.cpp. I've moved the class declaration to semaphore.hpp, but the implementation is left in the os_<os>.hpp. I considered creating semaphore_<os>.cpp files, but I ended up having to restructure too much code and I wanted to focus on the feature in [1] instead. Should I create a RFE to move the semaphore implementations?
> Please file an RFE for that.

Will do.

>
> Also, see my later comment about Solaris and perhaps having a shared POSIX implementation of this facility.
>
>> There seems to be another opportunity to cleanup the code. If we take os_linux.cpp, as an example, the code uses both the existing Semaphore class and the global ::sem_wait and ::sem_post functions. We might want to consider unifying that code.
> Please file an RFE for that.

Will do.

>
>> Since HotSpot is built on platforms that I don't have access to and can't test, I've added the IMPLEMENTS_SEMAPHORE_CLASS define. So, that I can detect if the the current platform implements the Semaphore class, and choose what synchronization primitive to use by the GC.
> Ugh!  Another reason to provide a POSIX-based implementation, as many other platforms could probably trivially provide support via that.

Sounds good to me, but I'll defer this to someone else.

>
> Looking ahead, I see that another reason to keep the old synchronization code would be to provide a diagnostic fallback, but I would hope that eventually this would all get cleaned up and the non-semaphore-based approach would go away.

I agree.

>
> On to the more detailed comments.  Where I've noted a pre-existing
> defect, I'd generally be fine with an RFE to fix later if you really
> don't want to spend the time on it right now.
>
> ------------------------------------------------------------------------------
> src/os/windows/vm/os_windows.cpp
>
> I didn't do a careful review of the Windows implementation - I'm not
> sufficiently well versed in low-level Windows development to feel like
> I can do an adequate job here.  I didn't see anything obviously
> concerning in what's implemented though.

Hopefully, someone else can review that code.

>
> Is the present lack of implementations for trywait and timedwait due
> to lack of need and so not bothering right now?  Or is there some
> fundamental difficulty?

The former.

>
> I'm wondering if it might be better to leave the unimplemented
> functions truely unimplemented, e.g. don't define them at all.  That
> way, if there is ever an attempt to access them, we should presumably
> get link-time errors, which would be better than runtime errors.

Sounds good to me. I'll change that.

>
> ------------------------------------------------------------------------------
> src/os/solaris/vm/os_solaris.cpp
>
> Probably not something to address in this change set, but I wonder why
> Solaris is using sema_xxx rather than using a shared POSIX sem_xxx
> implementation.  Such a POSIX implementation could probably be shared
> by Linux, (non-Apple) BSD, and Solaris, and perhaps others not
> maintained by Oracle.
>
> ------------------------------------------------------------------------------
> src/os/solaris/vm/os_solaris.cpp
> 2270 static int sem_max_value = -1;
> ...
> 2272 Semaphore::Semaphore(uint value, uint max) {
> 2273   guarantee(value <= max, "value lower than max");
> 2274
> 2275   static long sem_value_max = sysconf(_SC_SEM_VALUE_MAX);
> 2276   guarantee(max == Semaphore::NoMaxCount || value <= sem_max_value,
> 2277       err_msg("Max value: %u set higher than SEM_VALUE_MAX: %l", sem_value_max));
>
> There's a bit of a mess around sem_max_value and sem_value_max.
>
> sem_max_value is used in the comparison.  Since it's initialized to -1
> and never set anywhere, its int -1 is converted to uint max when
> comparing with the uint "value" argument.
>
> sem_value_max is obtained from sysconf and used in the error message,
> but not compared against the "value" argument.

My intention was to only use the sem_value_max, but I messed it up. I'll 
fix this.

>
> ------------------------------------------------------------------------------
> src/os/solaris/vm/os_solaris.cpp
> 2288 void Semaphore::signal() {
> 2289   int ret = sema_post(&_semaphore);
> 2290
> 2291   assert(ret == 0, err_msg("Semaphore signal failed: %d", errno));
> 2292 }
>
> sema_post is documented as returning the error code directly, rather
> than "returning" it in errno.  Fixing that would also fix the
> presently assigned but never used state of ret in non-debug builds.

You might be right, and I'll probably need to write a test that 
overflows the semaphore, but the man page says:
---

        RETURN VALUES

    Upon successful completion, 0 is returned; otherwise, a non-zero
    value indicates an error.

...

The *sema_trywait()* function will fail if::

EBUSY

    The semaphore pointed to by sp has a zero count.

The *sema_post()* function will fail if:

EOVERFLOW

    The semaphore value pointed to by sp exceeds SEM_VALUE_MAX .

...

              if (sema_trywait(&tellers) != 0)
                    if (errno == EAGAIN){ /* no teller available */

---

So, according to the example the error number is returned in errno, AFAICT.

>
> ------------------------------------------------------------------------------
> src/os/solaris/vm/os_solaris.cpp
> 2300 void Semaphore::wait() {
> 2301   int ret;
> 2302
> 2303   do {
> 2304     ret = sema_wait(&_semaphore);
> 2305     // Retry if the wait was interrupted by a signal.
> 2306   } while (errno == EINTR);
> 2307
> 2308   assert(ret == 0, err_msg("Semaphore wait failed: %d", errno));
> 2309 }
>
> sema_wait is documented as returning the error code directly, rather
> than "returning" it in errno.  Fixing that would also fix the
> presently assigned but never used state of ret in non-debug builds.
>
> ------------------------------------------------------------------------------
> src/os/solaris/vm/os_solaris.cpp
> 2311 bool Semaphore::trywait() {
> ...
> 2315 bool Semaphore::timedwait(unsigned int sec, int nsec) {
>
> Pre-existing defect: These functions silently swallow various errors,
> such as EINVAL.  It seems like they should be checking for those in at
> least debug builds.
>
> ------------------------------------------------------------------------------
> src/os/solaris/vm/os_solaris.cpp
> 2284 Semaphore::~Semaphore() {
> 2285   sema_destroy(&_semaphore);
> 2286 }
>
> Pre-existing defect: sema_destroy result not checked.
>
> ------------------------------------------------------------------------------
> src/os/bsd/vm/os_bsd.cpp
>
> Pre-existing defect: The semaphore implementation provided here is
> very weak on error checking.

I'll open an new RFE regarding the pre-existing defects.

>
> ------------------------------------------------------------------------------
> src/os/bsd/vm/os_bsd.cpp
> 1984 static jlong semaphore_currenttime() {
>
> Now that this function is file-scoped, it is clearly only used by the
> __APPLE__ implementation, so should be moved inside the nearby #ifdef.

I'll move the code.

>
> ------------------------------------------------------------------------------
> src/share/vm/utilities/semaphore.hpp
>
> The Semaphore class should not be copyable or copy-assignable.

The same goes for Monitor and Mutex, right? I'll fix this for Semaphore.

>
> ------------------------------------------------------------------------------
>

Thanks for reviewing!
StefanK