RFR: 8191786: Thread-SMR hash table size should be dynamic [v2]

Erik Österlund eosterlund at openjdk.java.net
Tue May 25 21:55:14 UTC 2021


On Tue, 25 May 2021 20:31:13 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> This is where we need @fisk. This comment and the associated code was
>> written by @fisk in the early days of the Thread-SMR project. So we need
>> Erik to go spelunking in the recesses of his memory for why he sized things
>> this way...
>> 
>> Also, I think the comment is a bit wrong and doesn't match the code below.
>
> The theory in the code for KVHashtable or BasicHashtable is that it works better for prime numbers.  It'll resize to other prime numbers if needed.  I assume starting with 1007 should work fine, if the number of entries matches the number of threads.
> 
> const int _small_table_sizes[] = { 107, 1009, 2017, 4049, 5051, 10103, 20201, 40423 } ;
> These are the resizing sizes in hashtable.cpp.
> 
> Edit: You could use this table to choose an initial size based on your current number of threads.

> This is where we need @fisk. This comment and the associated code was
> 
> written by @fisk in the early days of the Thread-SMR project. So we need
> 
> Erik to go spelunking in the recesses of his memory for why he sized things
> 
> this way...
> 
> 
> 
> Also, I think the comment is a bit wrong and doesn't match the code below.

I have a habit of sizing hash tables in this way to achieve the following:
* The power of two table size means you can map to a bucket using hash & (size - 1), instead of the much more expensive hash % size. Modulo is several orders of magnitude more expensive to execute.
* Make sure the average number of entries per bucket is not exceeding 1.

Having said that, the first of those two reasons no longer applies; the more generic table logic we use nowadays uses modulo based bucket mapping I think. I probably had a more manually crafted table earlier where I used that property. It was quite a while ago, so I don't remember what happened with that. If I had to guess, I'd think @coleenp might have encouraged me to not implement yet another hash table, as we already have too many in general.

>> We need @fisk to time in here with his rationale from the beginning of the Thread-SMR project.
>> 
>> The 1031 came from me and I stole it from some other fixed hash table size in the VM.
>> 
>> Someone please correct me if I'm wrong, but doesn't this hash_table_size value max out
>> at 64? `MIN2((int)threads->length(), 32)` => `32` when there are more than 32 threads
>> and `32 << 1` => `64`.
>> 
>> Yeah, we really need @fisk to chime in here.
>> 
>> Also, don't forget that these hash_tables are very, very short lived
>> since they only exist while we are gathering either hazard ptrs or
>> JavaThread* protected by hazard ptrs.
>
> Yes, 32 doesn't seem right.

> We need @fisk to time in here with his rationale from the beginning of the Thread-SMR project.
> 
> 
> 
> The 1031 came from me and I stole it from some other fixed hash table size in the VM.
> 
> 
> 
> Someone please correct me if I'm wrong, but doesn't this hash_table_size value max out
> 
> at 64? `MIN2((int)threads->length(), 32)` => `32` when there are more than 32 threads
> 
> and `32 << 1` => `64`.
> 
> 
> 
> Yeah, we really need @fisk to chime in here.
> 
> 
> 
> Also, don't forget that these hash_tables are very, very short lived
> 
> since they only exist while we are gathering either hazard ptrs or
> 
> JavaThread* protected by hazard ptrs.
> 
> 

I don't recognize the arbitrary 32 limit. If I did indeed write that (I have no recollection), then the only reason for it that I can think of is that hazard pointers are (or were?) very infrequent to encounter and that most of the time, we end up adding no entries at all to the table. And that it would be somehow wasteful to allocate large tables only to not put anything into it.

But yeah, wouldn't mind getting rid of this arbitrary heuristic limit. I don't think it matters today.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4168


More information about the hotspot-runtime-dev mailing list