RFR: 8300184: Optimize ResourceHashtableBase::iterate_all using _number_of_entries [v2]

Tue Jan 17 07:21:10 UTC 2023

On Mon, 16 Jan 2023 23:24:23 GMT, Xin Liu <xliu at openjdk.org> wrote:

>> Current implementation needs to visit all buckets. In extreme case, an empty hashtable with 1k buckets. iterate_all() loads 1k buckets for nothing. `_number_of_entries` is the number of nodes in the hashtable. This patch leverages that to quit iteration earlier. The real effect is up to the distribution of nodes among buckets.
>> 
>> I also included a debug log in the original patch. It was used for the observation in the following section, but I deleted it because it introduces circular dependency. 
>> 
>> log_debug(hashtables)("ResourceHashtableBase table_size = %d, break at " INTX_FORMAT, sz, (bucket - table()));
>> 
>> **Evaluation** 
>> In release build, the patch is very marginal. class loading do use ResourceHashtable but they are saturated. 
>> 
>> I use it to run `Renaissance/finagle-http`. hahstable iterate_all quits a little bit earlier and save some loads. eg. in line 2, there are 109 buckets in the hashable. the iteration quit at 68th bucket, we save 42 loads. 
>> 
>> 
>> $java -Xlog:hashtables=debug  -jar renaissance-gpl-0.14.1.jar finagle-http
>> ...
>> [12.824s][debug][hashtables] table_size = 109, break at 109
>> [12.824s][debug][hashtables] table_size = 109, break at 68
>> [12.824s][debug][hashtables] table_size = 109, break at 107
>> [12.825s][debug][hashtables] table_size = 109, break at 108
>> [12.825s][debug][hashtables] table_size = 109, break at 99
>> [12.825s][debug][hashtables] table_size = 109, break at 108
>> [12.825s][debug][hashtables] table_size = 109, break at 98
>> [12.825s][debug][hashtables] table_size = 109, break at 109
>> [12.825s][debug][hashtables] table_size = 109, break at 102
>> ...
>> 
>> 
>> Another application is in AsyncLog Thread. We have a hashtable with 17 buckets. Each Log Output creates one node in the hashtable. In common cases, there are only 2~3 nodes in the hashtable. Each time AsyncLog Thread flushes the buffer, it has to scan the hashtable twice. Saving loads means saving the memory bandwidth for the application. 
>> 
>> In fastdebug build, extra verification is deployed. It's common to observe that JVM scan an empty hashtable. It's in destroy_VM. 
>>  
>> 
>> ./build/linux-x86_64-server-fastdebug/images/jdk/bin/java -Xlog:hashtables=debug --version
>> openjdk 21-internal 2023-09-19
>> OpenJDK Runtime Environment (fastdebug build 21-internal-adhoc.xxinliu.jdk)
>> OpenJDK 64-Bit Server VM (fastdebug build 21-internal-adhoc.xxinliu.jdk, mixed mode, sharing)
>> [0.130s][debug][hashtables] table_size = 139, break at 0
>> [0.136s][debug][hashtables] table_size = 107, break at 0
>> [0.136s][debug][hashtables] table_size = 1009, break at 0
>> [0.136s][debug][hashtables] table_size = 109, break at 107
>> [0.136s][debug][hashtables] table_size = 109, break at 109
>> [0.137s][debug][hashtables] table_size = 109, break at 101
>
> Xin Liu has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Remove log_debug.
>    
>    Thread::name() isn't safe in gtest. We don't need the log line.
>  - Fixed the build error on x86-32.

Hi, David, 

Thank you for reviewing this patch. 

> Nit: the bug synopsis says `iterate_all` but you actually changed `iterate`.

This is intentional. iterate_all() calls iterate() with a lambda '()->true'.  For iterate(), there are two cases. If the lambda returns constant value true, it mandates iterate() to traverse all elements. By contrast, user-defined lambdas could return false and stop iterating immediately. This patch only optimizes the former case. That's why I claim to optimize iterate_all rather than iterate.  

Sometimes, we need to traverse all elements. eg. we want to print out the contents, copy hashable or do statistics.

thanks,
--lx

-------------

PR: https://git.openjdk.org/jdk/pull/12016