RFR JDK-8011200 (was 7143928) : (coll) Optimize for Empty ArrayList and HashMap

Thu Apr 11 22:03:40 UTC 2013

On 04/11/2013 01:33 AM, Vitaly Davidovich wrote:
>
> The null check jumps should be taken care of by branch prediction; as 
> long as they're predictable, penalty on OOO CPU is minimal.  So modern 
> CPUs don't like mispredicted branches I'd say, not just any jump.
>

yes :)
Rémi

> On Apr 10, 2013 7:23 PM, "Remi Forax" <forax at univ-mlv.fr 
> <mailto:forax at univ-mlv.fr>> wrote:
>
>     On 04/10/2013 11:26 PM, Martin Buchholz wrote:
>
>         I'm willing to accept John as an authority on hotspot
>         optimization.
>         I'm surprised that null checks aren't more close to free in
>         part because
>         recent jsr166 code has been introducing more explicit null checks,
>         sometimes in code where the reference being checked is "known"
>         not to be
>         null.
>
>         Martin
>
>
>     null check that are never null are free when the interpreter has
>     never (or rarely) sees a null
>     at a specific callsite because in that case, the JIT doesn't
>     generate a null check and
>     let the CPU/MMU do a fault (the VM has a signal handler to be able
>     to come back from death :)
>
>     If you set a field to null, the profiler will see the null and
>     will not use this optimization,
>     that why in this case, it's better to have an empty array instead
>     of null.
>     BTW, the nullcheck in assembler cost you almost nothing anyway but
>     the jump
>     associated with it has a cost. Modern CPUs do not like to jump.
>
>     Rémi
>
>
>
>         On Wed, Apr 10, 2013 at 11:12 AM, Mike Duigou
>         <mike.duigou at oracle.com <mailto:mike.duigou at oracle.com>>wrote:
>
>             On Apr 9 2013, at 19:56 , Martin Buchholz wrote:
>             The use of an empty array rather than null was suggested
>             by John Rose who
>             said:
>
>             I recommend an empty array rather than null as a sentinel
>             value for two
>             reasons:
>
>             1. The JVM prefers to merge null checks into load or store
>             instructions
>             (so-called "implicit null checks") because it removes an
>             explicit branch.
>             But it only does so if the probability of nulls is zero or
>             very low. But
>             using null as a sentinel for common states (e.g., empty
>             collection) defeats
>             this optimization.
>
>             2. For power-of-two sized structures (HashMap) we can
>             optimize away an
>             array range check in the presence of a zero-length check.
>
>             Since most uses of a variable-sized collection load and
>             test the array
>             length, the sentinel check can easily be overloaded onto
>             this test. If null
>             is not used, then the (safety-mandated) null check is
>             (usually) merged into
>             the load of the length. If the table is
>             power-of-two-sized, then only the
>             zero check remains, and the array range check may be
>             removed. This is
>             thought to be the best code for a frequent load from a
>             possibly-empty
>             collection.
>
>             Mike asked, "what about empty collection?" This is a
>             reasonable thing to
>             use, but it has a cost. The JVM uses inline caches and
>             type profiles to
>             simplify its optimized code; these techniques "win" when
>             at a given use
>             point (individual call to Map.get, for example) there is
>             only one class
>             present, even if the interface is totally general. (This
>             is the so-called
>             "monomorphic" case.) If the application uses (say) only
>             HashMaps for both
>             empty and non-empty maps, then this optimization can win
>             big. It can be
>             broken, on the other hand, if the application begins to
>             use one other type
>             for some other case (such as empty maps). In these cases,
>             it is better to
>             overload the "am I empty?" test on some other loaded
>             value, such as a null
>             or (better) an array length.
>
>
>             Mike
>
>