RFR: 8231666: ThreadIdTable::grow() invokes invalid thread transition

Wed Oct 2 13:25:48 UTC 2019

Hi Robbin,

On 2/10/2019 7:58 pm, Robbin Ehn wrote:
> Hi David,
> 
>> What if the table is full and must be grown?
> 
> The table uses chaining, it just means load factor tip over what is 
> considered a good backing array size.

Coleen raised a good question in a separate discussion, which made me 
realize that once the table has been initially populated all subsequent 
additions, and hence all subsequent calls to grow() always happen with 
the Threads_lock held. So we can't just defer the grow().

>> That aside, I'd like to know how expensive it is to grow this table. 
>> What are we talking about here?
> 
> We use global counter which on write_synchronize must scan all
> threads to make sure they have seen the update (there some
> optimization to avoid it if there is no readers at all). Since this
> table contains the threads, we get double penalized, for each new
> thread the synchronization cost increase AND the number of items.
> 
> With concurrent reads you still need many thousands of threads, but
> I think I saw someone mentioning 100k threads, assuming concurrent
> queries the resize can take hundreds of ms to finish. Note that reads
> and inserts still in operate roughly at the same speed while 
> resizing. So a longer resize is only problematic if we do not
> respect safepoints.
I think if anything were capable of running 100K threads we would be 
hitting far worse scalability bottlenecks than this. But this does seem 
problematic.

Thanks,
David
-----

> Thanks, Robbin
> 
>>
>> David
>>
>>> /Robbin
>>>
>>> On 2019-10-02 08:46, David Holmes wrote:
>>>> Hi Daniil,
>>>>
>>>> On 2/10/2019 4:13 pm, Daniil Titov wrote:
>>>>> Please review a change that fixes the issue. The problem here is 
>>>>> that that the thread is added to the ThreadIdTable  (introduced in 
>>>>> [3]) while the Threads_lock is held by
>>>>> JVM_StartThread. When new thread is added  to the thread table the 
>>>>> table checks if its load factor is greater than required and if so 
>>>>> it grows itself while polling for safepoints.
>>>>> After changes [4]  an attempt to block the thread while holding the 
>>>>> Threads_lock  results in assertion in 
>>>>> Thread::check_possible_safepoint().
>>>>>
>>>>> The fix  proposed by David Holmes ( thank you, David!)  is to skip 
>>>>> the ThreadBlockInVM inside ThreadIdTable::grow() method if the 
>>>>> current thread owns the Threads_lock.
>>>>
>>>> Sorry but looking at the fix in context now I think it would be 
>>>> better to do this:
>>>>
>>>>      while (gt.do_task(jt)) {
>>>>        if (Threads_lock->owner() == jt) {
>>>>          gt.pause(jt);
>>>>          ThreadBlockInVM tbivm(jt);
>>>>          gt.cont(jt);
>>>>        }
>>>>      }
>>>>
>>>> This way we don't waste time with the pause/cont when there's no 
>>>> safepoint pause going to happen - and the owner() check is quicker 
>>>> than owned_by_self(). That partially addresses a general concern I 
>>>> have about how long it may take to grow the table, as we are 
>>>> deferring safepoints until it is complete in this JVM_StartThread 
>>>> usecase.
>>>>
>>>> In the test you don't need all of:
>>>>
>>>>    32  * @run clean ThreadStartTest
>>>>    33  * @run build ThreadStartTest
>>>>    34  * @run main ThreadStartTest
>>>>
>>>> just the last @run suffices to build and run the test.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Testing : Mach 5 tier1 and tier2 completed successfully, tier3 is 
>>>>> in progress.
>>>>>
>>>>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8231666/webrev.01/
>>>>> [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8231666
>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>> [4] https://bugs.openjdk.java.net/browse/JDK-8184732
>>>>>
>>>>> Best regards,
>>>>> Danill
>>>>>
>>>>>