RFR: 8231666: ThreadIdTable::grow() invokes invalid thread transition

Wed Oct 2 09:58:15 UTC 2019

Hi David,

> What if the table is full and must be grown?

The table uses chaining, it just means load factor tip over what is considered a
good backing array size.

> 
> That aside, I'd like to know how expensive it is to grow this table. What are we 
> talking about here?

We use global counter which on write_synchronize must scan all threads to make
sure they have seen the update (there some optimization to avoid it if there is
no readers at all). Since this table contains the threads, we get double
penalized, for each new thread the synchronization cost increase AND the number
of items.

With concurrent reads you still need many thousands of threads, but I think I
saw someone mentioning 100k threads, assuming concurrent queries the resize can
take hundreds of ms to finish.
Note that reads and inserts still in operate roughly at the same speed while
resizing. So a longer resize is only problematic if we do not respect
safepoints.

Thanks, Robbin

> 
> David
> 
>> /Robbin
>>
>> On 2019-10-02 08:46, David Holmes wrote:
>>> Hi Daniil,
>>>
>>> On 2/10/2019 4:13 pm, Daniil Titov wrote:
>>>> Please review a change that fixes the issue. The problem here is that that 
>>>> the thread is added to the ThreadIdTable  (introduced in [3]) while the 
>>>> Threads_lock is held by
>>>> JVM_StartThread. When new thread is added  to the thread table the table 
>>>> checks if its load factor is greater than required and if so it grows itself 
>>>> while polling for safepoints.
>>>> After changes [4]  an attempt to block the thread while holding the 
>>>> Threads_lock  results in assertion in Thread::check_possible_safepoint().
>>>>
>>>> The fix  proposed by David Holmes ( thank you, David!)  is to skip the 
>>>> ThreadBlockInVM inside ThreadIdTable::grow() method if the current thread 
>>>> owns the Threads_lock.
>>>
>>> Sorry but looking at the fix in context now I think it would be better to do 
>>> this:
>>>
>>>      while (gt.do_task(jt)) {
>>>        if (Threads_lock->owner() == jt) {
>>>          gt.pause(jt);
>>>          ThreadBlockInVM tbivm(jt);
>>>          gt.cont(jt);
>>>        }
>>>      }
>>>
>>> This way we don't waste time with the pause/cont when there's no safepoint 
>>> pause going to happen - and the owner() check is quicker than 
>>> owned_by_self(). That partially addresses a general concern I have about how 
>>> long it may take to grow the table, as we are deferring safepoints until it 
>>> is complete in this JVM_StartThread usecase.
>>>
>>> In the test you don't need all of:
>>>
>>>    32  * @run clean ThreadStartTest
>>>    33  * @run build ThreadStartTest
>>>    34  * @run main ThreadStartTest
>>>
>>> just the last @run suffices to build and run the test.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> Testing : Mach 5 tier1 and tier2 completed successfully, tier3 is in progress.
>>>>
>>>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8231666/webrev.01/
>>>> [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8231666
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8185005
>>>> [4] https://bugs.openjdk.java.net/browse/JDK-8184732
>>>>
>>>> Best regards,
>>>> Danill
>>>>
>>>>