RFR JDK-8011200 (was 7143928) : (coll) Optimize for Empty ArrayList and HashMap

Wed Apr 10 23:18:23 UTC 2013

On 04/10/2013 11:26 PM, Martin Buchholz wrote:
> I'm willing to accept John as an authority on hotspot optimization.
> I'm surprised that null checks aren't more close to free in part because
> recent jsr166 code has been introducing more explicit null checks,
> sometimes in code where the reference being checked is "known" not to be
> null.
>
> Martin

null check that are never null are free when the interpreter has never 
(or rarely) sees a null
at a specific callsite because in that case, the JIT doesn't generate a 
null check and
let the CPU/MMU do a fault (the VM has a signal handler to be able to 
come back from death :)

If you set a field to null, the profiler will see the null and will not 
use this optimization,
that why in this case, it's better to have an empty array instead of null.
BTW, the nullcheck in assembler cost you almost nothing anyway but the jump
associated with it has a cost. Modern CPUs do not like to jump.

Rémi

>
>
> On Wed, Apr 10, 2013 at 11:12 AM, Mike Duigou <mike.duigou at oracle.com>wrote:
>
>> On Apr 9 2013, at 19:56 , Martin Buchholz wrote:
>> The use of an empty array rather than null was suggested by John Rose who
>> said:
>>
>> I recommend an empty array rather than null as a sentinel value for two
>> reasons:
>>
>> 1. The JVM prefers to merge null checks into load or store instructions
>> (so-called "implicit null checks") because it removes an explicit branch.
>> But it only does so if the probability of nulls is zero or very low. But
>> using null as a sentinel for common states (e.g., empty collection) defeats
>> this optimization.
>>
>> 2. For power-of-two sized structures (HashMap) we can optimize away an
>> array range check in the presence of a zero-length check.
>>
>> Since most uses of a variable-sized collection load and test the array
>> length, the sentinel check can easily be overloaded onto this test. If null
>> is not used, then the (safety-mandated) null check is (usually) merged into
>> the load of the length. If the table is power-of-two-sized, then only the
>> zero check remains, and the array range check may be removed. This is
>> thought to be the best code for a frequent load from a possibly-empty
>> collection.
>>
>> Mike asked, "what about empty collection?" This is a reasonable thing to
>> use, but it has a cost. The JVM uses inline caches and type profiles to
>> simplify its optimized code; these techniques "win" when at a given use
>> point (individual call to Map.get, for example) there is only one class
>> present, even if the interface is totally general. (This is the so-called
>> "monomorphic" case.) If the application uses (say) only HashMaps for both
>> empty and non-empty maps, then this optimization can win big. It can be
>> broken, on the other hand, if the application begins to use one other type
>> for some other case (such as empty maps). In these cases, it is better to
>> overload the "am I empty?" test on some other loaded value, such as a null
>> or (better) an array length.
>>
>>
>> Mike
>>