Probably a bug

Kirill A. Korinsky kirill at korins.ky
Mon Feb 24 22:22:41 UTC 2020


After playing with this for quite a while, I've created a very trivial test case: https://github.com/catap/shenandoah-akka-bug

The JVM exits with code 66 as soon as it hits this bug.
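
For context, the original report quoted below describes the affected structure as a lock-free linked queue built on getObjectVolatile(), getAndSet() and similar calls. Here is a minimal sketch of that kind of queue. It is not Akka's source: the class and method names are hypothetical, it assumes a single consumer thread, and it uses AtomicReference in place of Unsafe. It only shows where a consumer could spin forever if the link written by a producer never became visible; whether that is what actually happens here is unconfirmed.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only, NOT Akka's code. All names are hypothetical.
// Many producers may call add(); exactly one consumer thread may call poll().
final class NodeQueueSketch<T> {
    private static final class Node<E> {
        E value;
        // Volatile-style link to the next (newer) node.
        final AtomicReference<Node<E>> next = new AtomicReference<>();
        Node(E value) { this.value = value; }
    }

    // Newest node; producers swap themselves in with getAndSet().
    private final AtomicReference<Node<T>> head;
    // Oldest (already consumed) node; touched only by the single consumer.
    private Node<T> tail;

    NodeQueueSketch() {
        Node<T> stub = new Node<>(null);
        head = new AtomicReference<>(stub);
        tail = stub;
    }

    void add(T value) {
        Node<T> n = new Node<>(value);
        Node<T> prev = head.getAndSet(n); // step 1: publish n as the newest node
        prev.next.set(n);                 // step 2: link the predecessor to n
    }

    T poll() {
        Node<T> next = tail.next.get();
        if (next == null) {
            if (head.get() == tail) {
                return null;              // queue is really empty
            }
            // A producer is between step 1 and step 2: wait for the link.
            // If that link never became visible, this loop would never end.
            while ((next = tail.next.get()) == null) {
                Thread.yield();
            }
        }
        T v = next.value;
        next.value = null;                // let the payload be collected
        tail = next;
        return v;
    }
}
```

The spin in poll() is the only unbounded loop in this sketch, so a lost or misdirected store in step 2 of add() would look exactly like a hang inside the queue.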

-- 
wbr, Kirill

> On 24. Feb 2020, at 01:48, Kirill A. Korinsky <kirill at korins.ky> wrote:
> 
> I've played with -XX:ShenandoahVerifyLevel: I can't reproduce the bug with levels 0, 1 and 2, but as soon as I increase it to 3 it appears.
> 
> Without any hs_err.
> 
> 
> -- 
> wbr, Kirill
> 
>> On 23. Feb 2020, at 16:01, Kirill A. Korinsky <kirill at korins.ky> wrote:
>> 
>> I see that a new build is available at shipilev/openjdk-shenandoah.
>> 
>> I've tried the latest:
>> 
>> Step 1/19 : FROM shipilev/openjdk-shenandoah:8-fastdebug
>> 8-fastdebug: Pulling from shipilev/openjdk-shenandoah
>> 6f2295d35e78: Pull complete
>> f9939b5dfdd6: Pull complete
>> 89e73a891426: Pull complete
>> Digest: sha256:530547249752996bb7a88ec4970b97d58d0c3c525c8ae52a3f1e24dffc6d547d
>> Status: Downloaded newer image for shipilev/openjdk-shenandoah:8-fastdebug
>>  ---> 09d8eb00cc65
>> 
>> and would like to confirm that the bug still exists.
>> 
>> -- 
>> wbr, Kirill
>> 
>>> On 17. Feb 2020, at 15:59, Kirill A. Korinsky <kirill at korins.ky> wrote:
>>> 
>>> Good day,
>>> 
>>> I'd like to ask for advice, because it looks like I've discovered something that might be related to a Shenandoah bug.
>>> 
>>> I haven't got any proof that it is inside Shenandoah, nor a simple test case to reproduce it.
>>> 
>>> It appears inside Akka, and you can read my bug hunt with the Akka team here: https://github.com/akka/akka/issues/28601
>>> 
>>> In summary:
>>>  - it appears as an infinite loop inside an Akka queue, a lock-free linked queue implemented via getObjectVolatile(), getAndSet() and a few more atomic/unsafe calls.
>>>  - if I enable any debugging such as -XX:+ShenandoahVerify, the bug disappears => I can't provide any hs_err_log :(
>>>  - it exists on OpenJDK 8 from Fedora 31 and on shipilev/openjdk-shenandoah:8-fastdebug
>>>  - it is very difficult to trigger and very fragile. In real life it appears only on one, larger cluster; in my synthetic test case it requires bootstrapping an application and using an unreachable Akka system
>>>  - to trigger this bug I need a lot of garbage in the heap, produced by bootstrapping the application when it builds its index. The index is 0.5 GB to 1 GB in size (the heap is 2 GB); the size depends on a DB that is continuously updated, and the bug is reproducible at any possible size of the index.
>>>  - if I switch to G1, for example, it disappears.
>>> 
>>> Right now I see two possible sources of this bug:
>>>  - a very strange race condition inside Akka.
>>>  - a bug inside Shenandoah that is related to a missed barrier, or something deeper.
>>> 
>>> To eliminate or confirm the Shenandoah-related possibility I need some advice on how to do it, because I can't prepare easy-to-reproduce code :(
>>> 
>>> -- 
>>> wbr, Kirill
>>> 
>> 
> 



More information about the shenandoah-dev mailing list