Unexpected behaviour with larger Strings
Raffaello Giulietti
raffaello.giulietti at gmail.com
Mon Apr 20 18:56:58 UTC 2020
Hi,
I'm on Linux, but the following explanation probably applies to your
platform as well.
An easier way to obtain the same error on OpenJDK8 + HotSpot is to execute
byte[] b = new byte[Integer.MAX_VALUE];
which is exactly what happens behind the scenes in the UTF-8 case.
The encoder pessimistically assumes that each char may need up to 3
bytes. The expansion factor 3, however, is expressed as the float 3.0f.
This, in turn, is first converted to the double 3.0 and multiplied by
your length 1 << 30, and the product is cast to int. As the product,
3 * 2^30 = 3221225472, exceeds the int range, the cast produces
Integer.MAX_VALUE.
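
The arithmetic above can be reproduced in isolation. This is only a
minimal sketch: the class name ScaleDemo and the helper scale() are
hypothetical names chosen for illustration and are not copied from the
JDK sources. It relies on the Java rule that a narrowing cast from
double to int saturates at Integer.MAX_VALUE instead of wrapping around.

```java
final class ScaleDemo {
    // Hypothetical helper mirroring the computation described above:
    // widen the float factor to double, multiply, then cast to int.
    static int scale(int len, float expansionFactor) {
        return (int) (len * (double) expansionFactor);
    }

    public static void main(String[] args) {
        int len = 1 << 30;                      // 1073741824 chars
        double product = len * (double) 3.0f;   // exact in double arithmetic
        System.out.println((long) product);     // 3221225472
        // The double-to-int cast saturates rather than wrapping:
        System.out.println(scale(len, 3.0f));   // 2147483647
        System.out.println(scale(len, 3.0f) == Integer.MAX_VALUE); // true
    }
}
```

For small lengths the same helper yields the expected 3 * len, so the
saturation only shows up once the product exceeds the int range.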
While Integer.MAX_VALUE should be considered a legal array size, I
recall reading somewhere that implementations are allowed to be a
little more restrictive. Experimentally, the maximum length of a
byte[] on OpenJDK8 + HotSpot / Linux is Integer.MAX_VALUE - 2. I guess
it is the same on macOS.
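
A small probe along the following lines can be used to check the limit
on a given VM. It is a sketch only: ArrayLimitProbe and tryAllocate()
are hypothetical names, and the outcome depends on the VM and on the
configured heap size, so no particular output is claimed here.

```java
final class ArrayLimitProbe {
    // Tries to allocate a byte[] of the given length and reports the outcome
    // instead of letting the OutOfMemoryError propagate.
    static String tryAllocate(int length) {
        try {
            byte[] b = new byte[length];
            return "allocated byte[" + b.length + "]";
        } catch (OutOfMemoryError e) {
            // The message tells whether the VM rejected the size itself
            // or merely ran out of heap.
            return "OutOfMemoryError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAllocate(Integer.MAX_VALUE));
    }
}
```

Probing slightly smaller lengths such as Integer.MAX_VALUE - 2 only
gives a meaningful answer when the maximum heap is comfortably larger
than 2 GiB (for example via -Xmx); otherwise the allocation fails for
lack of heap rather than because of the array size limit.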
Hope this helps.
Greetings
Raffaello