From cwhite102 at gmail.com Fri May 25 04:57:36 2018 From: cwhite102 at gmail.com (Craig White) Date: Fri, 25 May 2018 00:57:36 -0400 Subject: Bug and patch - Heavy device io can hang Message-ID: <006b01d3f3e4$e6bb1630$b4314290$@com> I'm not sure if this library is being maintained or not, but I found it was quite useful to get some boards going with a pi! I ran into a hang situation with heavy writes to an I2C device, and traced the issue to improper mutex/condition code in dio_common.cpp. It was very reproducible and evident that the waitForCondition method could miss the signal from generateSignal() and just get stuck there waiting. Craig Here's a patch to fix the issue: @@ -279,4 +279,10 @@ if (sig != NULL) { javautil_list_add(signals, sig); + // Need to lock the mutex associated with the condition + // otherwise we could miss the signal + // This is done while holding the 'master' signalMutex + javacall_os_mutex_lock(javacall_os_cond_get_mutex(sig->condition)); + + // Now release the master lock javacall_os_mutex_unlock(signalMutex); @@ -282,2 +288,4 @@ javacall_os_mutex_unlock(signalMutex); + + // Wait on the condition (releases the condition mutex while waiting) javacall_os_cond_wait(sig->condition, timeout); @@ -283,4 +291,9 @@ javacall_os_cond_wait(sig->condition, timeout); + // Technically, this should be in a loop for spurious wakes, but + // I'm not familiar enough with the code for what the loop would check + + // Unlock the condition mutex + javacall_os_mutex_unlock(javacall_os_cond_get_mutex(sig->condition)); } } else { javacall_os_mutex_unlock(signalMutex); @@ -313,4 +326,9 @@ signal* sig = findTarget(signalType, signalTarget); if (sig != NULL) { sig->parameter = signalParameter; + + // Need to lock the mutex associated with the condition + // or waitForSignal could miss the signal + javacall_os_mutex_lock( javacall_os_cond_get_mutex(sig->condition) ); + javacall_os_cond_signal(sig->condition); @@ -316,4 +334,6 @@ javacall_os_cond_signal(sig->condition); + + javacall_os_mutex_unlock( javacall_os_cond_get_mutex(sig->condition) ); } else { // signal is being sent before wait started // create the signal and add it to the list From snazarkin at azul.com Fri May 25 15:32:48 2018 From: snazarkin at azul.com (Sergey Nazarkin) Date: Fri, 25 May 2018 15:32:48 +0000 Subject: Bug and patch - Heavy device io can hang In-Reply-To: <006b01d3f3e4$e6bb1630$b4314290$@com> References: <006b01d3f3e4$e6bb1630$b4314290$@com> Message-ID: <53343F75-5CCD-4914-9836-76643B59611C@azul.com> Hi, Craig! Thanks for the fix. I?ll take a look on Monday Sergey Nazarkin > On 25 May 2018, at 07:57, Craig White wrote: > > I'm not sure if this library is being maintained or not, but I found it was > quite useful to get some boards going with a pi! > > > > I ran into a hang situation with heavy writes to an I2C device, and traced > the issue to improper mutex/condition code in dio_common.cpp. > > It was very reproducible and evident that the waitForCondition method could > miss the signal from generateSignal() and just get stuck there waiting. > > > > Craig > > > > Here's a patch to fix the issue: > > > > > > @@ -279,4 +279,10 @@ > > if (sig != NULL) { > > javautil_list_add(signals, sig); > > + // Need to lock the mutex associated with the condition > > + // otherwise we could miss the signal > > + // This is done while holding the 'master' signalMutex > > + > javacall_os_mutex_lock(javacall_os_cond_get_mutex(sig->condition)); > > + > > + // Now release the master lock > > javacall_os_mutex_unlock(signalMutex); > > @@ -282,2 +288,4 @@ > > javacall_os_mutex_unlock(signalMutex); > > + > > + // Wait on the condition (releases the condition mutex while > waiting) > > javacall_os_cond_wait(sig->condition, timeout); > > @@ -283,4 +291,9 @@ > > javacall_os_cond_wait(sig->condition, timeout); > > + // Technically, this should be in a loop for spurious wakes, > but > > + // I'm not familiar enough with the code for what the loop > would check > > + > > + // Unlock the condition mutex > > + > javacall_os_mutex_unlock(javacall_os_cond_get_mutex(sig->condition)); > > } > > } else { > > javacall_os_mutex_unlock(signalMutex); > > @@ -313,4 +326,9 @@ > > signal* sig = findTarget(signalType, signalTarget); > > if (sig != NULL) { > > sig->parameter = signalParameter; > > + > > + // Need to lock the mutex associated with the condition > > + // or waitForSignal could miss the signal > > + javacall_os_mutex_lock( javacall_os_cond_get_mutex(sig->condition) > ); > > + > > javacall_os_cond_signal(sig->condition); > > @@ -316,4 +334,6 @@ > > javacall_os_cond_signal(sig->condition); > > + > > + javacall_os_mutex_unlock( > javacall_os_cond_get_mutex(sig->condition) ); > > } else { > > // signal is being sent before wait started > > // create the signal and add it to the list > > From snazarkin at azul.com Thu May 31 14:06:47 2018 From: snazarkin at azul.com (Sergey Nazarkin) Date: Thu, 31 May 2018 14:06:47 +0000 Subject: Bug and patch - Heavy device io can hang In-Reply-To: <53343F75-5CCD-4914-9836-76643B59611C@azul.com> References: <006b01d3f3e4$e6bb1630$b4314290$@com> <53343F75-5CCD-4914-9836-76643B59611C@azul.com> Message-ID: Craig, The patch looks OK (at least for the moment, the whole signal/wait subsystem requires refactoring). Have you signed Oracle Contributed Agreement (http://openjdk.java.net/contribute/)? Sergey Nazarkin > On 25 May 2018, at 18:32, Sergey Nazarkin wrote: > > Hi, Craig! > > Thanks for the fix. I?ll take a look on Monday > > > Sergey Nazarkin > > > > >> On 25 May 2018, at 07:57, Craig White wrote: >> >> I'm not sure if this library is being maintained or not, but I found it was >> quite useful to get some boards going with a pi! >> >> >> >> I ran into a hang situation with heavy writes to an I2C device, and traced >> the issue to improper mutex/condition code in dio_common.cpp. >> >> It was very reproducible and evident that the waitForCondition method could >> miss the signal from generateSignal() and just get stuck there waiting. >> >> >> >> Craig >> >> >> >> Here's a patch to fix the issue: >> >> >> >> >> >> @@ -279,4 +279,10 @@ >> >> if (sig != NULL) { >> >> javautil_list_add(signals, sig); >> >> + // Need to lock the mutex associated with the condition >> >> + // otherwise we could miss the signal >> >> + // This is done while holding the 'master' signalMutex >> >> + >> javacall_os_mutex_lock(javacall_os_cond_get_mutex(sig->condition)); >> >> + >> >> + // Now release the master lock >> >> javacall_os_mutex_unlock(signalMutex); >> >> @@ -282,2 +288,4 @@ >> >> javacall_os_mutex_unlock(signalMutex); >> >> + >> >> + // Wait on the condition (releases the condition mutex while >> waiting) >> >> javacall_os_cond_wait(sig->condition, timeout); >> >> @@ -283,4 +291,9 @@ >> >> javacall_os_cond_wait(sig->condition, timeout); >> >> + // Technically, this should be in a loop for spurious wakes, >> but >> >> + // I'm not familiar enough with the code for what the loop >> would check >> >> + >> >> + // Unlock the condition mutex >> >> + >> javacall_os_mutex_unlock(javacall_os_cond_get_mutex(sig->condition)); >> >> } >> >> } else { >> >> javacall_os_mutex_unlock(signalMutex); >> >> @@ -313,4 +326,9 @@ >> >> signal* sig = findTarget(signalType, signalTarget); >> >> if (sig != NULL) { >> >> sig->parameter = signalParameter; >> >> + >> >> + // Need to lock the mutex associated with the condition >> >> + // or waitForSignal could miss the signal >> >> + javacall_os_mutex_lock( javacall_os_cond_get_mutex(sig->condition) >> ); >> >> + >> >> javacall_os_cond_signal(sig->condition); >> >> @@ -316,4 +334,6 @@ >> >> javacall_os_cond_signal(sig->condition); >> >> + >> >> + javacall_os_mutex_unlock( >> javacall_os_cond_get_mutex(sig->condition) ); >> >> } else { >> >> // signal is being sent before wait started >> >> // create the signal and add it to the list >> >> > From cwhite102 at gmail.com Thu May 31 14:18:37 2018 From: cwhite102 at gmail.com (Craig White) Date: Thu, 31 May 2018 10:18:37 -0400 Subject: Bug and patch - Heavy device io can hang In-Reply-To: References: <006b01d3f3e4$e6bb1630$b4314290$@com> <53343F75-5CCD-4914-9836-76643B59611C@azul.com> Message-ID: <013f01d3f8ea$44f10c00$ced32400$@com> Thanks Sergey. I haven't signed the contributor agreement yet. Do you need this to accept the patch? I usually just use my co-worker (who has some level of contributor status) as the interface to Oracle/OpenJDK when we find issues. Craig -----Original Message----- From: Sergey Nazarkin [mailto:snazarkin at azul.com] Sent: Thursday, May 31, 2018 10:07 AM To: Craig White Cc: dio-dev at openjdk.java.net Subject: Re: Bug and patch - Heavy device io can hang Craig, The patch looks OK (at least for the moment, the whole signal/wait subsystem requires refactoring). Have you signed Oracle Contributed Agreement (http://openjdk.java.net/contribute/)? Sergey Nazarkin > On 25 May 2018, at 18:32, Sergey Nazarkin wrote: > > Hi, Craig! > > Thanks for the fix. I?ll take a look on Monday > > > Sergey Nazarkin > > > > >> On 25 May 2018, at 07:57, Craig White wrote: >> >> I'm not sure if this library is being maintained or not, but I found it was >> quite useful to get some boards going with a pi! >> >> >> >> I ran into a hang situation with heavy writes to an I2C device, and traced >> the issue to improper mutex/condition code in dio_common.cpp. >> >> It was very reproducible and evident that the waitForCondition method could >> miss the signal from generateSignal() and just get stuck there waiting. >> >> >> >> Craig >> >> >> >> Here's a patch to fix the issue: >> >> >> >> >> >> @@ -279,4 +279,10 @@ >> >> if (sig != NULL) { >> >> javautil_list_add(signals, sig); >> >> + // Need to lock the mutex associated with the condition >> >> + // otherwise we could miss the signal >> >> + // This is done while holding the 'master' signalMutex >> >> + >> javacall_os_mutex_lock(javacall_os_cond_get_mutex(sig->condition)); >> >> + >> >> + // Now release the master lock >> >> javacall_os_mutex_unlock(signalMutex); >> >> @@ -282,2 +288,4 @@ >> >> javacall_os_mutex_unlock(signalMutex); >> >> + >> >> + // Wait on the condition (releases the condition mutex while >> waiting) >> >> javacall_os_cond_wait(sig->condition, timeout); >> >> @@ -283,4 +291,9 @@ >> >> javacall_os_cond_wait(sig->condition, timeout); >> >> + // Technically, this should be in a loop for spurious wakes, >> but >> >> + // I'm not familiar enough with the code for what the loop >> would check >> >> + >> >> + // Unlock the condition mutex >> >> + >> javacall_os_mutex_unlock(javacall_os_cond_get_mutex(sig->condition)); >> >> } >> >> } else { >> >> javacall_os_mutex_unlock(signalMutex); >> >> @@ -313,4 +326,9 @@ >> >> signal* sig = findTarget(signalType, signalTarget); >> >> if (sig != NULL) { >> >> sig->parameter = signalParameter; >> >> + >> >> + // Need to lock the mutex associated with the condition >> >> + // or waitForSignal could miss the signal >> >> + javacall_os_mutex_lock( javacall_os_cond_get_mutex(sig->condition) >> ); >> >> + >> >> javacall_os_cond_signal(sig->condition); >> >> @@ -316,4 +334,6 @@ >> >> javacall_os_cond_signal(sig->condition); >> >> + >> >> + javacall_os_mutex_unlock( >> javacall_os_cond_get_mutex(sig->condition) ); >> >> } else { >> >> // signal is being sent before wait started >> >> // create the signal and add it to the list >> >> > From snazarkin at azul.com Thu May 31 14:24:40 2018 From: snazarkin at azul.com (Sergey Nazarkin) Date: Thu, 31 May 2018 14:24:40 +0000 Subject: Bug and patch - Heavy device io can hang In-Reply-To: <013f01d3f8ea$44f10c00$ced32400$@com> References: <006b01d3f3e4$e6bb1630$b4314290$@com> <53343F75-5CCD-4914-9836-76643B59611C@azul.com> <013f01d3f8ea$44f10c00$ced32400$@com> Message-ID: <7F1BDF0B-7914-4665-A918-B098D323C975@azul.com> To accept the patch I need your (or your employer) OCA. Are you individual contributor or this is part of job activity? Sergey Nazarkin > On 31 May 2018, at 17:18, Craig White wrote: > > Thanks Sergey. > I haven't signed the contributor agreement yet. Do you need this to accept the patch? > I usually just use my co-worker (who has some level of contributor status) as the interface to Oracle/OpenJDK when we find issues. > Craig > > > -----Original Message----- > From: Sergey Nazarkin [mailto:snazarkin at azul.com] > Sent: Thursday, May 31, 2018 10:07 AM > To: Craig White > Cc: dio-dev at openjdk.java.net > Subject: Re: Bug and patch - Heavy device io can hang > > Craig, > > The patch looks OK (at least for the moment, the whole signal/wait subsystem requires refactoring). Have you signed Oracle Contributed Agreement (http://openjdk.java.net/contribute/)? > > > Sergey Nazarkin > > > > >> On 25 May 2018, at 18:32, Sergey Nazarkin wrote: >> >> Hi, Craig! >> >> Thanks for the fix. I?ll take a look on Monday >> >> >> Sergey Nazarkin >> >> >> >> >>> On 25 May 2018, at 07:57, Craig White wrote: >>> >>> I'm not sure if this library is being maintained or not, but I found it was >>> quite useful to get some boards going with a pi! >>> >>> >>> >>> I ran into a hang situation with heavy writes to an I2C device, and traced >>> the issue to improper mutex/condition code in dio_common.cpp. >>> >>> It was very reproducible and evident that the waitForCondition method could >>> miss the signal from generateSignal() and just get stuck there waiting. >>> >>> >>> >>> Craig >>> >>> >>> >>> Here's a patch to fix the issue: >>> >>> >>> >>> >>> >>> @@ -279,4 +279,10 @@ >>> >>> if (sig != NULL) { >>> >>> javautil_list_add(signals, sig); >>> >>> + // Need to lock the mutex associated with the condition >>> >>> + // otherwise we could miss the signal >>> >>> + // This is done while holding the 'master' signalMutex >>> >>> + >>> javacall_os_mutex_lock(javacall_os_cond_get_mutex(sig->condition)); >>> >>> + >>> >>> + // Now release the master lock >>> >>> javacall_os_mutex_unlock(signalMutex); >>> >>> @@ -282,2 +288,4 @@ >>> >>> javacall_os_mutex_unlock(signalMutex); >>> >>> + >>> >>> + // Wait on the condition (releases the condition mutex while >>> waiting) >>> >>> javacall_os_cond_wait(sig->condition, timeout); >>> >>> @@ -283,4 +291,9 @@ >>> >>> javacall_os_cond_wait(sig->condition, timeout); >>> >>> + // Technically, this should be in a loop for spurious wakes, >>> but >>> >>> + // I'm not familiar enough with the code for what the loop >>> would check >>> >>> + >>> >>> + // Unlock the condition mutex >>> >>> + >>> javacall_os_mutex_unlock(javacall_os_cond_get_mutex(sig->condition)); >>> >>> } >>> >>> } else { >>> >>> javacall_os_mutex_unlock(signalMutex); >>> >>> @@ -313,4 +326,9 @@ >>> >>> signal* sig = findTarget(signalType, signalTarget); >>> >>> if (sig != NULL) { >>> >>> sig->parameter = signalParameter; >>> >>> + >>> >>> + // Need to lock the mutex associated with the condition >>> >>> + // or waitForSignal could miss the signal >>> >>> + javacall_os_mutex_lock( javacall_os_cond_get_mutex(sig->condition) >>> ); >>> >>> + >>> >>> javacall_os_cond_signal(sig->condition); >>> >>> @@ -316,4 +334,6 @@ >>> >>> javacall_os_cond_signal(sig->condition); >>> >>> + >>> >>> + javacall_os_mutex_unlock( >>> javacall_os_cond_get_mutex(sig->condition) ); >>> >>> } else { >>> >>> // signal is being sent before wait started >>> >>> // create the signal and add it to the list >>> >>> >> > From cwhite102 at gmail.com Thu May 31 21:18:48 2018 From: cwhite102 at gmail.com (Craig White) Date: Thu, 31 May 2018 17:18:48 -0400 Subject: Bug and patch - Heavy device io can hang In-Reply-To: <7F1BDF0B-7914-4665-A918-B098D323C975@azul.com> References: <006b01d3f3e4$e6bb1630$b4314290$@com> <53343F75-5CCD-4914-9836-76643B59611C@azul.com> <013f01d3f8ea$44f10c00$ced32400$@com> <7F1BDF0B-7914-4665-A918-B098D323C975@azul.com> Message-ID: I've just emailed the signed OCA to Oracle as an individual. Craig > On May 31, 2018, at 10:24 AM, Sergey Nazarkin wrote: > > To accept the patch I need your (or your employer) OCA. Are you individual contributor or this is part of job activity? > > > Sergey Nazarkin > > > > >> On 31 May 2018, at 17:18, Craig White wrote: >> >> Thanks Sergey. >> I haven't signed the contributor agreement yet. Do you need this to accept the patch? >> I usually just use my co-worker (who has some level of contributor status) as the interface to Oracle/OpenJDK when we find issues. >> Craig >> >> >> -----Original Message----- >> From: Sergey Nazarkin [mailto:snazarkin at azul.com] >> Sent: Thursday, May 31, 2018 10:07 AM >> To: Craig White >> Cc: dio-dev at openjdk.java.net >> Subject: Re: Bug and patch - Heavy device io can hang >> >> Craig, >> >> The patch looks OK (at least for the moment, the whole signal/wait subsystem requires refactoring). Have you signed Oracle Contributed Agreement (http://openjdk.java.net/contribute/)? >> >> >> Sergey Nazarkin >> >> >> >> >>> On 25 May 2018, at 18:32, Sergey Nazarkin wrote: >>> >>> Hi, Craig! >>> >>> Thanks for the fix. I?ll take a look on Monday >>> >>> >>> Sergey Nazarkin >>> >>> >>> >>> >>>> On 25 May 2018, at 07:57, Craig White wrote: >>>> >>>> I'm not sure if this library is being maintained or not, but I found it was >>>> quite useful to get some boards going with a pi! >>>> >>>> >>>> >>>> I ran into a hang situation with heavy writes to an I2C device, and traced >>>> the issue to improper mutex/condition code in dio_common.cpp. >>>> >>>> It was very reproducible and evident that the waitForCondition method could >>>> miss the signal from generateSignal() and just get stuck there waiting. >>>> >>>> >>>> >>>> Craig >>>> >>>> >>>> >>>> Here's a patch to fix the issue: >>>> >>>> >>>> >>>> >>>> >>>> @@ -279,4 +279,10 @@ >>>> >>>> if (sig != NULL) { >>>> >>>> javautil_list_add(signals, sig); >>>> >>>> + // Need to lock the mutex associated with the condition >>>> >>>> + // otherwise we could miss the signal >>>> >>>> + // This is done while holding the 'master' signalMutex >>>> >>>> + >>>> javacall_os_mutex_lock(javacall_os_cond_get_mutex(sig->condition)); >>>> >>>> + >>>> >>>> + // Now release the master lock >>>> >>>> javacall_os_mutex_unlock(signalMutex); >>>> >>>> @@ -282,2 +288,4 @@ >>>> >>>> javacall_os_mutex_unlock(signalMutex); >>>> >>>> + >>>> >>>> + // Wait on the condition (releases the condition mutex while >>>> waiting) >>>> >>>> javacall_os_cond_wait(sig->condition, timeout); >>>> >>>> @@ -283,4 +291,9 @@ >>>> >>>> javacall_os_cond_wait(sig->condition, timeout); >>>> >>>> + // Technically, this should be in a loop for spurious wakes, >>>> but >>>> >>>> + // I'm not familiar enough with the code for what the loop >>>> would check >>>> >>>> + >>>> >>>> + // Unlock the condition mutex >>>> >>>> + >>>> javacall_os_mutex_unlock(javacall_os_cond_get_mutex(sig->condition)); >>>> >>>> } >>>> >>>> } else { >>>> >>>> javacall_os_mutex_unlock(signalMutex); >>>> >>>> @@ -313,4 +326,9 @@ >>>> >>>> signal* sig = findTarget(signalType, signalTarget); >>>> >>>> if (sig != NULL) { >>>> >>>> sig->parameter = signalParameter; >>>> >>>> + >>>> >>>> + // Need to lock the mutex associated with the condition >>>> >>>> + // or waitForSignal could miss the signal >>>> >>>> + javacall_os_mutex_lock( javacall_os_cond_get_mutex(sig->condition) >>>> ); >>>> >>>> + >>>> >>>> javacall_os_cond_signal(sig->condition); >>>> >>>> @@ -316,4 +334,6 @@ >>>> >>>> javacall_os_cond_signal(sig->condition); >>>> >>>> + >>>> >>>> + javacall_os_mutex_unlock( >>>> javacall_os_cond_get_mutex(sig->condition) ); >>>> >>>> } else { >>>> >>>> // signal is being sent before wait started >>>> >>>> // create the signal and add it to the list >>>> >>>> >>> >> >