RFR: 8255455: Refactor ThreadHeapSampler::_log_table as constexpr

Claes Redestad redestad at openjdk.java.net
Thu Oct 29 14:14:46 UTC 2020


On Thu, 29 Oct 2020 08:56:13 GMT, Lin Zang <lzang at openjdk.org> wrote:

>> The static `ThreadHeapSampler::_log_table` is currently initialized during JVM bootstrap at a cost of ~67k instructions (linux-x64). By turning the initialization into a constexpr, we can precalculate the helper table at compile time, trading the runtime overhead for a small (8 KB) increase in static footprint.
>> 
>> I compared `fast_log2` with the `log2` builtin in a naive benchmarking experiment[1] (not included in this PR), which shows that `fast_log2` is ~2.5x faster than `log2` on my system, and that without the lookup table it would be considerably slower than `log2`. So I think it makes sense to preserve this optimization, but get rid of the startup overhead:
>> 
>> [5.428s][debug][heapsampling] log2, 0.0751173 secs
>> [5.457s][debug][heapsampling] fast_log2, 0.0298244 secs
>> [5.622s][debug][heapsampling] fast_log2_uncached, 0.1645569 secs
>> 
>> I've verified that this refactoring does not affect performance in this naive setup.
>> 
>> [1] https://github.com/openjdk/jdk/compare/master...cl4es:log2_micro?expand=1
>
> Dear @cl4es, 
> I am not a reviewer; I just have one comment: you may need to update the copyright year in the headers of the touched files.
> 
> Thanks.
> Lin
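For readers following along, the table-lookup idea described in the quoted text can be sketched roughly as below. This is a minimal illustration, not the HotSpot code: the names (`log_table_sketch`, `fast_log2_sketch`, `kLogBits`) are hypothetical, the 1024-entry size is an assumption chosen because 1024 doubles match the 8 KB figure, and the sketch assumes IEEE 754 64-bit doubles and positive, normal inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical names and sizes; 1024 doubles == 8 KB (an assumption).
constexpr int kLogBits = 10;
constexpr int kLogTableSize = 1 << kLogBits;
constexpr uint32_t kLogMask = kLogTableSize - 1;

static double log_table_sketch[kLogTableSize];

// Runtime initialization: this is the kind of startup work the PR wants to avoid.
void init_log_table_sketch() {
  for (int i = 0; i < kLogTableSize; i++) {
    // log2 of the mantissa value 1 + i/1024, which lies in [1, 2).
    log_table_sketch[i] = std::log2(1.0 + static_cast<double>(i) / kLogTableSize);
  }
}

// Approximates log2(d) as exponent + table[top mantissa bits].
double fast_log2_sketch(double d) {
  assert(d > 0 && "requires a positive, normal double");
  uint64_t bits = 0;
  std::memcpy(&bits, &d, sizeof(bits));  // reinterpret the double's bit pattern
  const uint32_t high = static_cast<uint32_t>(bits >> 32);
  // The upper 32 bits hold sign (1), exponent (11), and the top 20 mantissa bits.
  const uint32_t index = (high >> (20 - kLogBits)) & kLogMask;
  const int32_t exponent = static_cast<int32_t>((high >> 20) & 0x7FF) - 1023;
  return exponent + log_table_sketch[index];
}
```

Because only the top 10 mantissa bits are used, the result is truncated rather than exact; the error is bounded by roughly log2(1 + 1/1024), which is about 0.0014.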

Unfortunately there's currently no portable way to use `std::log` (or any of the other `std` math functions) in a constexpr, so I had to resort to a code generator approach instead. It's either that or withdrawing this PR.

Using unified logging (UL) and a debug-only block to implement an ad hoc code generator (`-Xlog:heapsampling+generate::none`) might be a bit unorthodox, but I think it turned out OK.
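To illustrate the general shape of a code generator approach, here is a rough standalone sketch. It is not the UL-based generator in the PR: the function name `generate_log_table_source`, the emitted array name `log_table`, and the 1024-entry size are all assumptions made for illustration. The idea is to compute the values once with the ordinary runtime `std::log2` and emit them as C++ source text, so the resulting table can be pasted in as a constant initializer without needing constexpr math functions.

```cpp
#include <cmath>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical size; 1024 doubles == 8 KB (an assumption).
constexpr int kTableBits = 10;
constexpr int kTableSize = 1 << kTableBits;

// Returns C++ source text for a precomputed log2 table.
std::string generate_log_table_source() {
  std::ostringstream out;
  // Scientific notation with 16 fractional digits gives 17 significant
  // digits, enough to round-trip an IEEE 754 double.
  out << std::scientific << std::setprecision(16);
  out << "static constexpr double log_table[" << kTableSize << "] = {\n";
  for (int i = 0; i < kTableSize; i++) {
    const double v = std::log2(1.0 + static_cast<double>(i) / kTableSize);
    out << "  " << v << (i + 1 < kTableSize ? "," : "") << "\n";
  }
  out << "};\n";
  return out.str();
}
```

Printing the returned string to a file and including that file would replace the runtime initialization loop with a static table, which is the trade-off the PR description is after.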

-------------

PR: https://git.openjdk.java.net/jdk/pull/880


More information about the hotspot-runtime-dev mailing list