potential performance improvement in sun.nio.cs.UTF_8

Johannes Döbler jd at civilian-framework.org
Mon May 12 15:46:24 UTC 2025


Hi Chen,

thanks for your feedback. Indeed it does not make sense to optimize 
UTF-8 processing for a rather vague set of beneficiaries when there are 
realistic counterexamples.
Still, I don't want to give up on my idea too early :-)
I tried this modification:

  * harvest pure ASCII bytes before the loop (as in the current decoder)
  * within the loop, if a 1-byte UTF-8 sequence is recognized, invoke
    JLA.decodeAscii, but only a limited number of times (e.g. 10);
    after that, just copy the byte to the output buffer (as in the
    current implementation)
  * in my benchmark timings this gives the JLA.decodeAscii boost for
    inputs which have rather long ASCII sequences, without degrading
    performance due to JLA call overhead in other scenarios
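
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: the class name, the MAX_BULK_CALLS constant and the local decodeAscii helper are hypothetical. The helper is a plain-loop stand-in for the JDK-internal JLA.decodeAscii, which is not callable from application code. The sketch also assumes well-formed input with 1- to 3-byte sequences and does none of the malformed-input handling a real decoder needs.

```java
import java.nio.charset.StandardCharsets;

class BoundedAsciiDecodeSketch {

    // Hypothetical cap on bulk-ASCII calls inside the loop ("e.g. 10" above).
    static final int MAX_BULK_CALLS = 10;

    // Stand-in for the JDK-internal bulk ASCII routine (assumption): copies
    // leading ASCII bytes from src to dst and returns how many were copied.
    static int decodeAscii(byte[] src, int sp, char[] dst, int dp, int len) {
        int n = 0;
        while (n < len && src[sp + n] >= 0) {
            dst[dp + n] = (char) src[sp + n];
            n++;
        }
        return n;
    }

    // Decodes well-formed UTF-8 (1- to 3-byte sequences only) into dst,
    // which must hold at least src.length chars. Returns the char count.
    static int decode(byte[] src, char[] dst) {
        int sl = src.length;
        // Harvest the leading pure-ASCII run before entering the loop.
        int sp = decodeAscii(src, 0, dst, 0, sl);
        int dp = sp;
        int bulkCallsLeft = MAX_BULK_CALLS;
        while (sp < sl) {
            int b1 = src[sp];
            if (b1 >= 0) {
                if (bulkCallsLeft > 0) {
                    // Bounded number of bulk calls: pays off on long ASCII runs.
                    bulkCallsLeft--;
                    int n = decodeAscii(src, sp, dst, dp, sl - sp);
                    sp += n;
                    dp += n;
                } else {
                    // Budget used up: plain per-byte copy, no call overhead.
                    dst[dp++] = (char) b1;
                    sp++;
                }
            } else if ((b1 & 0xE0) == 0xC0 && sp + 1 < sl) {
                // 2-byte sequence
                dst[dp++] = (char) (((b1 & 0x1F) << 6) | (src[sp + 1] & 0x3F));
                sp += 2;
            } else if ((b1 & 0xF0) == 0xE0 && sp + 2 < sl) {
                // 3-byte sequence
                dst[dp++] = (char) (((b1 & 0x0F) << 12)
                        | ((src[sp + 1] & 0x3F) << 6)
                        | (src[sp + 2] & 0x3F));
                sp += 3;
            } else {
                throw new IllegalArgumentException("not handled in this sketch");
            }
        }
        return dp;
    }

    public static void main(String[] args) {
        byte[] in = "long ascii prefix, then \u00e4\u00f6\u00fc, then ascii again"
                .getBytes(StandardCharsets.UTF_8);
        char[] out = new char[in.length];
        int n = decode(in, out);
        System.out.println(new String(out, 0, n));
    }
}
```

The call budget is per decode invocation here; whether it should be a fixed count or adapt to the length of the ASCII runs seen so far is an open choice that this sketch does not try to settle.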

Thanks
Johannes