RFR: 8326096: Deprecate getTotalIn, getTotalOut methods of java.util.zip.Inflater, java.util.zip.Deflater [v4]

Eirik Bjørsnøs eirbjo at openjdk.org
Tue Feb 20 13:31:53 UTC 2024


On Tue, 20 Feb 2024 11:28:08 GMT, Eirik Bjørsnøs <eirbjo at openjdk.org> wrote:

>> Please review this PR which proposes that we officially deprecate the following four methods in the `java.util.zip` package:
>> 
>> * `Inflater.getTotalIn()`
>> * `Inflater.getTotalOut()`
>> * `Deflater.getTotalIn()`
>> * `Deflater.getTotalOut()`
>> 
>> Since these legacy methods return `int`, they cannot safely return the number of bytes processed without the risk of losing information about the magnitude or even the sign of the returned value.
>> 
>> The corresponding `getBytesRead()` and `getBytesWritten()` methods, introduced in Java 5, return `long` and should be used instead when obtaining this information.
>> 
>> Unrelated to the deprecation itself, the documentation currently does not specify what these methods are expected to return when the number of processed bytes is higher than `Integer.MAX_VALUE`. This PR aims to clarify this in the API specification.
>> 
>> Initially, this PR handles only `Inflater.getTotalIn()`. The other three methods will be updated once the wordsmithing for this method stabilizes.
>
> Eirik Bjørsnøs has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Use "greater than" instead of "larger than"
>  - Leave first sentence as-is. Simplify the deprecation notice.

Before any further wordsmithing of API changes, we should perhaps take one step back and try to reach consensus on the following question:

> Should these methods specify return values when the number of processed bytes exceeds `Integer.MAX_VALUE`?

On one hand, it is generally good practice to specify the full range of possible return values for a method.

On the other hand, one can argue that the value returned by the current implementation isn't particularly useful. Since the upper 32 bits are simply discarded, the returned number can lose its magnitude and even its sign, which makes it hard to see how it can have any practical use for the caller. In fact, the caller cannot even distinguish a correct return value from an incorrect one. Basically, a return value from these methods cannot be trusted, regardless of the value.
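To make the problem concrete, here is a minimal sketch of what discarding the upper 32 bits does to a byte count. It assumes the narrowing behaves like a plain `(int)` cast, as described above; the class name `TruncationDemo` and the helper `truncate` are hypothetical and exist only for illustration.

```java
public class TruncationDemo {

    // Hypothetical helper: keeps only the low 32 bits of a 64-bit count,
    // which is what a narrowing (int) cast does in Java.
    static int truncate(long bytesProcessed) {
        return (int) bytesProcessed;
    }

    public static void main(String[] args) {
        long justOver = Integer.MAX_VALUE + 1L; // 2^31 bytes processed
        long wrapped = (1L << 32) + 100;        // 2^32 + 100 bytes processed

        System.out.println(truncate(justOver)); // -2147483648: the sign is lost
        System.out.println(truncate(wrapped));  // 100: the magnitude is lost
        System.out.println(truncate(100L));     // 100: same result as the line above
    }
}
```

The last two calls return the same value for very different counts, which is why a caller looking only at the `int` cannot tell a correct result from a wrapped one.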

Because of this, perhaps it is better to only specify the boundary conditions where a correct result cannot be returned?

I think we want to make it abundantly clear that the return values from calling these methods cannot be trusted. Meanwhile, we want to be concise and quickly lead the user down the correct path, which is to use the replacement methods instead.
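For reference, a caller following the "correct path" might look roughly like the sketch below, which reads the counts through `getBytesRead()` and `getBytesWritten()` only. The class name `ByteCountDemo`, the method `roundTrip`, and the buffer size are assumptions made for this example, not part of the PR.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ByteCountDemo {

    // Hypothetical helper: compresses and decompresses the input, returning
    // { deflater bytes read, deflater bytes written,
    //   inflater bytes read, inflater bytes written } as long values.
    static long[] roundTrip(byte[] input) {
        byte[] buf = new byte[1024];
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();

        long deflaterIn;
        long deflaterOut;
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                compressed.write(buf, 0, n);
            }
            // long-returning replacements for getTotalIn() / getTotalOut()
            deflaterIn = deflater.getBytesRead();
            deflaterOut = deflater.getBytesWritten();
        } finally {
            deflater.end();
        }

        long inflaterIn;
        long inflaterOut;
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed.toByteArray());
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    break; // defensive: avoid spinning on truncated data
                }
            }
            inflaterIn = inflater.getBytesRead();
            inflaterOut = inflater.getBytesWritten();
        } catch (DataFormatException e) {
            throw new IllegalStateException("unexpected corrupt stream", e);
        } finally {
            inflater.end();
        }

        return new long[] { deflaterIn, deflaterOut, inflaterIn, inflaterOut };
    }

    public static void main(String[] args) {
        long[] counts = roundTrip(new byte[5000]);
        System.out.println("deflater read/written: " + counts[0] + "/" + counts[1]);
        System.out.println("inflater read/written: " + counts[2] + "/" + counts[3]);
    }
}
```

Because every count stays a `long` from end to end, the same code keeps working once a stream grows past `Integer.MAX_VALUE` bytes.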

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17919#issuecomment-1954221075


More information about the core-libs-dev mailing list