RFR: 8345120: A likely bug in StringSupport::chunkedStrlenShort [v3]
Maurizio Cimadamore
mcimadamore at openjdk.org
Fri Nov 29 12:47:38 UTC 2024
On Fri, 29 Nov 2024 12:36:19 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
>> Per Minborg has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix imports
>
> src/java.base/share/classes/jdk/internal/foreign/StringSupport.java line 187:
>
>> 185: // Prevent over scanning as we step by 2
>> 186: final long endScan = toOffset & ~1; // The last bit is zero
>> 187: for (; offset < endScan; offset += Short.BYTES) {
>
> Question: do we care about finding unaligned sequences of two consecutive zero bytes? Because I don't think this method can detect those. E.g. if the first zero byte in a terminator sequence starts at an odd offset I don't think we can recognize the terminator.
>
> This is a bit of a weird side-effect, because now this method no longer behaves like a plain byte loop where we look for two consecutive bytes. But, I suppose, in a way it also makes sense: if I want to read a C string whose characters are 2-bytes each, then I definitively need to scan the segment (at given offset) two bytes at a time. A terminator at an "odd offset" from the start of the scan is not good. I believe the javadoc is a bit "light" on how this terminator is actually found :-)
Even the fact that the last byte is ignored (if length is odd), or that up to 3 bytes might be ignored in the `strlenInt` might not be obvious. Maybe we should file a followup PR for improving the javadoc?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/22451#discussion_r1863473652
More information about the core-libs-dev
mailing list