RFR: 8354300: Fields in String are not trusted

ExE Boss duke at openjdk.org
Sat Apr 19 06:30:51 UTC 2025


On Mon, 14 Apr 2025 18:09:35 GMT, Chen Liang <liach at openjdk.org> wrote:

>> Hello Per, I'm not too familiar with runtime compiler optimizations. So consider this as a basic question. 
>> 
>>> This means the VM can trust these fields to never change which enables constant folding optimizations.
>> 
>> If I'm not wrong, then it is the `hash` field value that we want to be considered as a constant (once computed) so that calls to `String.hashCode()` would get replaced with the constant computed value.
>> 
>> Looking at the current implementation of `String.hashCode()`:
>> 
>> 
>> public int hashCode() {
>>         // The hash or hashIsZero fields are subject to a benign data race,
>>         // making it crucial to ensure that any observable result of the
>>         // calculation in this method stays correct under any possible read of
>>         // these fields. Necessary restrictions to allow this to be correct
>>         // without explicit memory fences or similar concurrency primitives is
>>         // that we can ever only write to one of these two fields for a given
>>         // String instance, and that the computation is idempotent and derived
>>         // from immutable state
>>         int h = hash;
>>         if (h == 0 && !hashIsZero) {
>>             h = isLatin1() ? StringLatin1.hashCode(value)
>>                            : StringUTF16.hashCode(value);
>>             if (h == 0) {
>>                 hashIsZero = true;
>>             } else {
>>                 hash = h;
>>             }
>>         }
>>         return h;
>>     }
>> 
>> 
>> If I'm reading that correctly, and keeping aside concurrent calls from this discussion, then only one of `hash` or the `hashIsZero` fields will have its value changed to a non-default value. i.e. if `hashCode()` implementation computes a non-zero value then the `hash` field will be assigned a (non-default) value and if that method computes a hash of 0, then `hashIsZero` will get assigned a (non-default) value. It then means that the other field will never move out of its initial value and thus will never be considered "stable".
>> 
>> Am I right? If yes, then would the runtime (hotspot) compiler still replace the call to `String.hashCode()` with a constant value?
>
> Also re @jaikiran: yes, you are right that the current code cannot constant-fold the scenario where the hash is 0; so `"".hashCode()` is not constant as a result. The solution I shared above can address this scenario, but it cannot completely bring performance to parity with other constant-folded cases in Remi's shared benchmark (see https://github.com/liachmodded/jdk/commit/247e8bd92e6dbad6df2dd50ad83caa49983a81b4)

@liach
> ```java
> var isZero = hashIsZero;
> if (isZero == 1) return 0;
> if (isZero == 2) {
> 	int h = hash;
> 	if (h != 0) return h;
> }
> return computeHash(); // and set hash, hashIsZero fields
> ```

Instead of `hashIsZero`, the field should be called `hashState` in that case.

And maybe use `-1` for the `hashCode() == 0` state and `+1` for the `hashCode() != 0` state.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24625#issuecomment-2816567930


More information about the core-libs-dev mailing list