RFR: 8179004: Add an efficient implementation of the "count trailing zeros" operation
David Holmes
david.holmes at oracle.com
Fri May 5 08:02:53 UTC 2017
On 5/05/2017 3:43 PM, Kim Barrett wrote:
> Still looking for a Reviewer.
Reviewed.
I don't see an obviously better way to deal with the structure.
Thanks,
David
>> On Apr 20, 2017, at 2:17 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
>>
>> Please review this addition of the count_trailing_zeros function.
>>
>> Unfortunately, there isn't an obvious direct and portable way to write
>> such a function. But supported hardware architectures generally
>> provide an efficient implementation, e.g. a single instruction or a
>> short sequence. Compilers often provide "built-in" or "intrinsic"
>> access to that hardware implementation, or one can use inline
>> assembler.
>>
>> If a platform doesn't have such a built-in efficient implementation,
>> the function can be implemented using de Bruijn sequences as discussed
>> in the literature. But all of the OpenJDK-supported platforms provide
>> an efficient implementation, so we aren't providing such a fallback.
>>
>> As part of reviewing this change, please feel free to suggest
>> alternative ways to structure the code. I'm not completely happy with
>> the way I've done it, but alternatives I've tried seemed worse.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8179004
>>
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8179004/hotspot.00/
>>
>> Testing:
>> Unit test for new function.
>>
>> Experimented with modifying BitMap::get_next_one_offset and the like
>> to replace the existing for-loop with a call to this new function, and
>> measured substantial speedups on all tested platforms for a test
>> program that counts the bits in a bitmap by iterating calls to that
>> search function. The speedup varies with the density of set bits, with
>> very sparse or nearly full being similar to the existing code, since
>> the bit counting is not the dominant factor in those situations. But
>> in between give speedups of factors of x1.5 to x5 or more, depending
>> on the density and the platform.
>
>
More information about the hotspot-dev
mailing list