RFR: 8367981: Update CompactHashtable comments [v2]

Tue Sep 23 16:58:32 UTC 2025

On Thu, 18 Sep 2025 12:48:02 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hi,
>> Can you help to review this patch?
>> 
>> The comments of CompactHashtable and related classes are out of date, and some comments are unclear, wrong or misleading.
>> As the related classes are used in more and more scenarios, it's helpful to update the comments.
>> 
>> Thanks
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   remove calculate_header_size

Hi Hamlin, thanks for doing the clean up. I have some suggestions to for the comments, and some further code changes to improve readability.

src/hotspot/share/classfile/compactHashtable.hpp line 166:

> 164: // with the type in the left-most 2-bit and offset in the remaining 30-bit.
> 165: // The last entry is a special type. It contains the end of the last
> 166: // bucket.

These two lines should be deleted.

src/hotspot/share/classfile/compactHashtable.hpp line 170:

> 168: // There are three types of buckets, regular buckets and value_only buckets and table_end
> 169: // bucket. The value_only buckets have '01' in their highest 2-bit, and regular buckets have
> 170: // '00' in their highest 2-bit, and table_end bucket has '11' in its highest 2-bit.

Suggestion:

// There are three types of buckets: regular, value_only, and table_end.
// The regular buckets have '00' in their highest 2-bit.
// The value_only buckets have '01' in their highest 2-bit.
// There is only a single table_end bucket that marks the end of buckets[]. It has '11' in its highest 2-bit.

src/hotspot/share/classfile/compactHashtable.hpp line 174:

> 172: // For regular buckets, each entry is 8 bytes in the entries[]:
> 173: //   u4 hash;    // entry hash
> 174: //   u4 value;   // it could be offset or index or any payload.

I think `value` is confusing, as it's not the same as the template type `V`. I would suggest changing the declaration to 

class CompactHashtableWriter: public StackObj {
public:
  class Entry {
    unsigned int _hash;
-   u4 _value;
+   u4 _encoded_value;

All parameters of `offset` should be changed to `encoded_value`.

This comment should be updated:

u4 encoded_value; // A 32-bit encoding of the template type V. The template parameter DECODE
                  // converts this to type V. Many CompactHashtables encode a pointer as a 32-bit offset, where
                  //   V entry = (V)(base_address + offset)
                  // see StringTable, SymbolTable and AdapterHandlerLibrary for examples

I think we should consider a future clean up of adding an `ENCODE` template parameter so that the various table dumping functions can be consolidated.

src/hotspot/share/classfile/compactHashtable.hpp line 178:

> 176: //               //   YourEntryType* entry = (YourEntryType*)(base_address + offset)
> 177: //               // check StringTable, SymbolTable and AdapterHandlerLibrary for its usage examples.
> 178: // A regular bucket could also contains zero entry for empty bucket.

The line "// A regular bucket could also contains zero entry for empty bucket" should be deleted, replaced with explanation below.

src/hotspot/share/classfile/compactHashtable.hpp line 182:

> 180: // For value_only buckets, each entry has only the 4-byte 'value' in the entries[].
> 181: //
> 182: // For table_end bucket, there is no corresponding entry, i.e. it always contains zero entry.

Suggestion:

// The single table_end bucket has no corresponding entry.

src/hotspot/share/classfile/compactHashtable.hpp line 193:

> 191: //         |  +----+ +----+                 |
> 192: //         v       v v                      v
> 193: // entries[H,O,H,O,O,H,O,H,O.................]

Suggestion:

// The number of entries in bucket <i> can be calculated like this:
//      my_offset   = _buckets[i]   & 0x3fffffff; // mask off top 2-bit
//      next_offset = _buckets[i+1] & 0x3fffffff
//  For REGULAR_BUCKET_TYPE
//      num_entries = (next_offset - my_offset) / 8;
//  For VALUE_ONLY_BUCKET_TYPE
//      num_entries = (next_offset - my_offset) / 4;
//
// If bucket <i> is empty, we have my_offset == next_offset. Empty buckets are
// always encoded as regular buckets.
//
// In the following example:
//   - The 0-th bucket is a REGULAR_BUCKET_TYPE with two entries
//   - The 1-st bucket is a VALUE_ONLY_BUCKET_TYPE with one entry.
//   - The 2-th bucket is a REGULAR_BUCKET_TYPE with zeo entries.
//
// buckets[0, 4, 5(empty), 5, ...., N(table_end)]
//         |  |  |         |        |
//         |  |  +---+-----+        |
//         |  |      |              |
//         |  +----+ +              |
//         v       v v              v
// entries[H,O,H,O,O,H,O,H,O........]

-------------

Changes requested by iklam (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27360#pullrequestreview-3258735478
PR Review Comment: https://git.openjdk.org/jdk/pull/27360#discussion_r2372852733
PR Review Comment: https://git.openjdk.org/jdk/pull/27360#discussion_r2372825872
PR Review Comment: https://git.openjdk.org/jdk/pull/27360#discussion_r2372931360
PR Review Comment: https://git.openjdk.org/jdk/pull/27360#discussion_r2372866326
PR Review Comment: https://git.openjdk.org/jdk/pull/27360#discussion_r2372831444
PR Review Comment: https://git.openjdk.org/jdk/pull/27360#discussion_r2372902511