RFR: 8293170: Improve encoding of the debuginfo nmethod section

Serguei Spitsyn sspitsyn at openjdk.org
Thu Jul 6 22:35:55 UTC 2023


On Thu, 2 Feb 2023 12:54:06 GMT, Boris Ulasevich <bulasevich at openjdk.org> wrote:

> This is another pull request to replace https://github.com/openjdk/jdk/pull/10025 change which was blocked as not acceptable (see https://github.com/openjdk/jdk/pull/10025#pullrequestreview-1228216330)
> 
> The objections to change #10025 were:
> - specialized algorithm for given data complicates things, makes it hard to learn, test and support
> - algorithm is changed for DebugInfo, and the benefit is only for one type of data
> - statistics of the debug info data can (will) change, breaking the optimization
> 
> The suggestion was:
> - don't change the core algorithm, but add one on top or underneath the existing one, or reuse off-the-shelf zero-reduction schemes such as Cap'n Proto
> 
> With this change I propose a different approach. Instead of bit coding, the sequence of zeros in a data stream is encoded with a special character that normally never appears in Unsinged5 encoding, followed by a byte containing a number of zeros. In this way the updated algorithm is a pure extension of the existing encoding algorithm: data encoded without the zero-reduction trick is unpacked in the same way as before.
> 
> Currently there are several datasets affected by this change: Dependencies info, OopMap info, LineNumber info, Debug info. Only Debug info has a large number of zeros and gets a significant benefit. I experimented with the Cap'n Proto and lz4 algorithms on DebugInfo. The Unsinged5 algorithm has a better compression rate than these.
> 
> DebugInfo data size is reduced by ~20% (actually, 10-30%, depending on the application). Total nmethod size reduction is ~3%.
> 
> Performance impact: Renaisance and DaCapo benchmarks do not show any difference.

src/hotspot/share/code/compressedStream.hpp line 118:

> 116:   bool handle_zero(juint value) {
> 117:     if (value == 0) {
> 118:       _zero_count = (_zero_count == 0xFF) ? 0 : _zero_count;

The case of `_zero_count` overflow is not clear. Apparently, I'm missing something here.
Current code is just clearing the previously counted `_zero_count`.
I'd expect some action like storing the current number of zeros or advancing the `_position`.
Do you have a test for this?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/12387#discussion_r1254988216


More information about the serviceability-dev mailing list