<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">Hi Chen,<br>
<br>
Thanks for your feedback. Indeed it does not make sense to
optimize UTF-8 processing for a rather vague set of beneficiaries
when there are realistic counterexamples.<br>
Still, I don't want to give up on my idea too early :-)<br>
I tried this modification:<br>
<ul>
<li>harvest the pure ASCII bytes before the loop (as in the
current decoder)</li>
<li>within the loop, if a 1-byte UTF-8 sequence is recognized,
invoke JLA.decodeAscii, but only a limited number of times
(e.g. 10); otherwise just copy the byte to the output buffer
(as in the current implementation); see the sketch after this
list<br>
</li>
<li>in my benchmark timings this gives the JLA.decodeAscii boost
for inputs that contain rather long ASCII sequences, while not
degrading performance through JLA call overhead in the other
scenarios<br>
</li>
</ul>
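<p>To make the intended control flow concrete, here is a minimal
sketch of that loop structure (not the actual patch). It assumes
the JDK-internal JavaLangAccess.decodeASCII bulk helper that the
current sun.nio.cs.UTF_8 decoder already uses; the class name, the
ASCII_CALL_BUDGET constant and the elided multi-byte branch are
illustrative only, and compiling it outside java.base would need
--add-exports java.base/jdk.internal.access=ALL-UNNAMED:</p>
<pre>
// Minimal sketch of the modified decode loop; not the actual patch.
// JLA.decodeASCII(src, srcOff, dst, dstOff, len) bulk-decodes consecutive
// ASCII bytes and returns how many it converted (it stops at the first
// non-ASCII byte). ASCII_CALL_BUDGET is an illustrative name.
import jdk.internal.access.JavaLangAccess;
import jdk.internal.access.SharedSecrets;

class Utf8DecodeSketch {
    private static final JavaLangAccess JLA = SharedSecrets.getJavaLangAccess();

    // Illustrative cap on JLA.decodeASCII invocations inside the loop ("e.g. 10").
    private static final int ASCII_CALL_BUDGET = 10;

    // Decodes src[sp..sl) into dst; returns the number of chars written.
    static int decode(byte[] src, int sp, int sl, char[] dst) {
        int dp = 0;
        // 1) Harvest the leading pure-ASCII run, as the current decoder does.
        int n = JLA.decodeASCII(src, sp, dst, dp, sl - sp);
        sp += n;
        dp += n;

        int budget = ASCII_CALL_BUDGET;
        while (sp &lt; sl) {
            if (src[sp] >= 0) {                 // 1-byte (ASCII) sequence
                if (budget > 0) {
                    // 2a) Let the bulk decode consume the whole ASCII run,
                    //     but only a limited number of times.
                    n = JLA.decodeASCII(src, sp, dst, dp, sl - sp);
                    sp += n;
                    dp += n;
                    budget--;
                } else {
                    // 2b) Budget exhausted: plain per-byte copy, as today.
                    dst[dp++] = (char) src[sp++];
                }
            } else {
                // 2-, 3- and 4-byte sequences: the existing slow path,
                // elided in this sketch.
                throw new UnsupportedOperationException("multi-byte path elided");
            }
        }
        return dp;
    }
}
</pre>
<p>The budget is what keeps the worst case bounded: input that
alternates between single ASCII bytes and multi-byte sequences pays
the JLA call overhead at most ASCII_CALL_BUDGET times and then falls
back to the plain per-byte copy of the current implementation.</p>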
Thanks<br>
Johannes</div>
</body>
</html>