RFR: 8179004: Add an efficient implementation of the "count trailing zeros" operation

Chris Plummer chris.plummer at oracle.com
Fri Apr 21 05:54:40 UTC 2017


On 4/19/17 11:17 PM, Kim Barrett wrote:
> Please review this addition of the count_trailing_zeros function.
>
> Unfortunately, there isn't an obvious direct and portable way to write
> such a function. But supported hardware architectures generally
> provide an efficient implementation, e.g. a single instruction or a
> short sequence.  Compilers often provide "built-in" or "intrinsic"
> access to that hardware implementation, or one can use inline
> assembler.
>
> If a platform doesn't have such a built-in efficient implementation,
> the function can be implemented using de Bruijn sequences as discussed
> in the literature.  But all of the OpenJDK-supported platforms provide
> an efficient implementation, so we aren't providing such a fallback.
>
> As part of reviewing this change, please feel free to suggest
> alternative ways to structure the code. I'm not completely happy with
> the way I've done it, but alternatives I've tried seemed worse.
Are you talking about the #if/#elseif structure and the ////// 
delimiters. I'd suggest replacing the ////// with a block comment that 
stands out. Something like:

/**************
  * GCC Support
  **************/

Otherwise it looks fine to me, but I'm no expert in this area. The 
testing looks satisfactory and the bitmap results you are getting are 
reason enough to add this functionality.

Chris
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8179004
>
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8179004/hotspot.00/
>
> Testing:
> Unit test for new function.
>
> Experimented with modifying BitMap::get_next_one_offset and the like
> to replace the existing for-loop with a call to this new function, and
> measured substantial speedups on all tested platforms for a test
> program that counts the bits in a bitmap by iterating calls to that
> search function. The speedup varies with the density of set bits, with
> very sparse or nearly full being similar to the existing code, since
> the bit counting is not the dominant factor in those situations. But
> in between give speedups of factors of x1.5 to x5 or more, depending
> on the density and the platform.
>



More information about the hotspot-dev mailing list