RFR: 8301136: Improve unlink() and unlink_all() of ResourceHashtableBase [v5]

Ioi Lam iklam at openjdk.org
Sun Jan 29 06:44:16 UTC 2023


On Fri, 27 Jan 2023 23:16:52 GMT, Xin Liu <xliu at openjdk.org> wrote:

>> 1. Apply the same idea of JDK-8300184 to unlink().
>> 2. Because ResourceHashtableBase doesn't support copy assignment, a client has to purge all elements first when it needs to assign to it. We would like to provide a specialized version called 'unlink_all()'. In this case we don't need to update each node's _next; we only nullify all the buckets.
>> 3. This patch also provides a specialized version of unlink_all() for the destructor. There we don't even update the buckets; the table is dead anyway.
>
> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add lambda API unlink(Function f) per reviewers' request.

I updated [JDK-8301296](https://bugs.openjdk.org/browse/JDK-8301296) to clarify the problems with the current ResourceHashtable design and my proposal for fixing them.

I think the above proposed fixes shouldn't block the progress of this PR, which is just an optimization that maintains the current behavior. So let's continue the discussion here.

There are two parts to this PR:

[1] For the optimization (counting the number of entries and exiting the loop early), do you have any performance data showing that it is beneficial?

For this optimization to be beneficial, two conditions must hold: (a) the table is oversized, so it is likely to have many unused buckets, and (b) the hash function is bad, so most of the unused buckets sit at the end of the table.
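
To make the discussion concrete, here is a minimal standalone sketch of the early-exit idea as I understand it. The names (Node, unlink_with_early_exit, the field layout) are placeholders for illustration, not the actual ResourceHashtableBase declarations, and the real code frees nodes through its own allocator rather than `delete`:

#include <cstddef>

struct Node {
  int   _key;    // payload details omitted
  Node* _next;
};

// Visit buckets only until every live entry has been seen; trailing
// empty buckets are never scanned. Predicate is bool(Node*), where
// true means "remove this entry".
template <typename Predicate>
void unlink_with_early_exit(Node** table, unsigned table_size,
                            int& number_of_entries, Predicate should_remove) {
  int remaining = number_of_entries;   // live entries not yet visited
  for (unsigned i = 0; i < table_size && remaining > 0; i++) {
    Node** ptr = &table[i];
    while (*ptr != nullptr) {
      Node* node = *ptr;
      remaining--;                     // visited one live entry
      if (should_remove(node)) {
        *ptr = node->_next;            // splice the node out of the chain
        delete node;
        number_of_entries--;
      } else {
        ptr = &node->_next;
      }
    }
  }
  // Once remaining reaches zero, the loop stops and the empty tail of
  // the table is skipped; that skip is the entire benefit.
}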

For (a), maybe it's better to change the table to ResizeableResourceHashtable?
For (b), maybe you can also print out the occupancy of the table in your traces, like this one from your earlier PR (https://github.com/openjdk/jdk/pull/12016):


[12.824s][debug][hashtables] table_size = 109, break at 68


If we have many entries (e.g., 40) but they are all packed at the front of the table, that means we have a bad hash function.
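
As a sketch of what such a trace could report (the helper and the log format here are made up, and Node is the placeholder struct from the sketch under [1]):

#include <cstdio>

// Count how many buckets are occupied and where the last occupied one
// sits. Many entries combined with a small last_used index would be a
// sign of the clustering described above, i.e., a weak hash function.
void print_occupancy(Node** table, unsigned table_size) {
  unsigned used = 0;
  unsigned last_used = 0;
  for (unsigned i = 0; i < table_size; i++) {
    if (table[i] != nullptr) {
      used++;
      last_used = i;
    }
  }
  printf("table_size = %u, used buckets = %u, last used bucket = %u\n",
         table_size, used, last_used);
}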

[2] As I suggested earlier, we should consolidate the code to use a single unlink loop, so you can apply this counting optimization in a single place. 

I am not quite sure why you would need the following in your "wrapper" functions:


if (clean) {
  *ptr = node->_next;
} else {
  ptr = &(node->_next);
}


and


if (node->_next == nullptr) {
  // nullify the bucket when reaching the end of the linked list.
  *ptr = nullptr;
}


I wrote a version of the consolidated loop that's much simpler. It also aligns with the old code, so the diff is more readable:

https://github.com/openjdk/jdk/compare/master...iklam:jdk:8301136-resource-hash-unlink-all-suggestion
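
For readers who don't want to follow the link, the gist (in my words, not the code in that branch) is that every public variant funnels into one loop, so the counting optimization from [1] needs to be applied in only a single place. A rough sketch, reusing the hypothetical unlink_with_early_exit() from above; the class shape and names are assumptions:

class TableSketch {
  static const unsigned TABLE_SIZE = 109;
  Node* _table[TABLE_SIZE];
  int   _number_of_entries;

 public:
  // f is bool(Node*): it decides per entry whether to remove it.
  template <typename Function>
  void unlink(Function f) {
    unlink_with_early_exit(_table, TABLE_SIZE, _number_of_entries, f);
  }
  // f is void(Node*): it only performs cleanup; every entry is removed.
  template <typename Function>
  void unlink_all(Function f) {
    unlink_with_early_exit(_table, TABLE_SIZE, _number_of_entries,
                           [&](Node* n) { f(n); return true; });
  }
  // No per-entry cleanup; every entry is removed.
  void unlink_all() {
    unlink_with_early_exit(_table, TABLE_SIZE, _number_of_entries,
                           [](Node*) { return true; });
  }
};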

Note that I have both `unlink_all()` and `unlink_all(Function function)`. That's because the current API allows the user function to do two things: (a) check whether the entry should be freed, and (b) perform special cleanup on the entry. So if you want to free all entries but still need to perform special cleanup, you'd call `unlink_all(Function function)`.
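
A hypothetical use of the two flavors in terms of the sketch above (free_payload() is a made-up cleanup hook, not an existing API):

void free_payload(Node* n);   // hypothetical per-entry cleanup

void drop_everything(TableSketch& t) {
  t.unlink_all();                                  // no per-entry work needed
}

void cleanup_then_drop(TableSketch& t) {
  t.unlink_all([](Node* n) { free_payload(n); });  // clean up, then remove all
}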

-------------

PR: https://git.openjdk.org/jdk/pull/12213

