RFR: 8261744: Implement CharsetDecoder ASCII and latin-1 fast-paths [v2]

Philippe Marschall github.com+471021+marschall at openjdk.java.net
Thu Feb 18 20:01:44 UTC 2021


On Wed, 17 Feb 2021 17:22:21 GMT, Alan Bateman <alanb at openjdk.org> wrote:

>>> Right, I'm not exactly sure why the more limited changes I attempted in [5f4e87f](https://github.com/openjdk/jdk/commit/5f4e87f50f49e64b8616063c176ea35632b0347e) failed. In that change I simply changed the initialization order, which made the failing (closed) tests pass locally - but not in our CI. Since this is in initPhase1 there should be no concurrency possible.
>> 
>> The Reference Handler thread is started by the initializer in jl.ref.Reference so could be a candidate. The Finalizer thread is another, but it should call VM.awaitInitLevel(1) and not touch JLA until initPhase1 is done.
>
>> but you're probably right and it would be good to make the name more explicit when exporting it outside of the package internal use. How about `inflateBytesToChars`?
> 
> That should be okay.

> > > Is there a reason `sun.nio.cs.ISO_8859_1.Encoder#implEncodeISOArray(char[], int, byte[], int, int)` wasn't moved to `JavaLangAccess` as well?
> > 
> > 
> > > Exposing StringUTF16.compress for Latin-1 and ASCII-compatible encoders seems very reasonable, which I was thinking of exploring next as a separate RFE.
> 
> Maybe I misunderstood. The intrinsified method you point out here pre-dates the work in JDK 9 to similarly intrinsify char[]->byte[] compaction in StringUTF16, see https://bugs.openjdk.java.net/browse/JDK-6896617
> 
> It might be worthwhile cleaning this up. Not having to route via SharedSecrets -> JavaLangAccess does speed things up during startup/interpretation, at the cost of some code duplication.

My understanding was that ISO_8859_1$Encoder.implEncodeISOArray and StringUTF16.compress are ultimately hooked up to the same intrinsic. I find it inconsistent that ISO_8859_1$Encoder accesses an encoding intrinsic directly while ISO_8859_1$Decoder and others access a decoding intrinsic indirectly through JavaLangAccess. I realize this RFE is about decoding, so leaving encoding to a separate RFE may indeed be better.
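
To make the two directions concrete, here is a minimal sketch in plain Java of what the decoding (byte[] -> char[]) and encoding (char[] -> byte[]) loops compute for Latin-1. It is not the JDK code and does not touch JavaLangAccess or any intrinsic. The class name `Latin1Sketch` is hypothetical, the method names are only borrowed from this thread, and the "stop at the first char above 0xFF and return the count" behaviour of the encoder is an assumption made for illustration.

```java
// Illustrative only: plain loops standing in for the intrinsified methods.
class Latin1Sketch {

    // Decoding direction: widen each Latin-1 byte to a char.
    // Name borrowed from the proposal in this thread; the body is an assumption.
    static void inflateBytesToChars(byte[] src, int srcOff,
                                    char[] dst, int dstOff, int len) {
        for (int i = 0; i < len; i++) {
            // Mask to avoid sign extension of bytes >= 0x80.
            dst[dstOff + i] = (char) (src[srcOff + i] & 0xff);
        }
    }

    // Encoding direction: narrow chars to bytes while they fit in Latin-1.
    // Assumed contract: returns the number of chars encoded, stopping at the
    // first char above 0xFF.
    static int encodeISOArray(char[] sa, int sp, byte[] da, int dp, int len) {
        int i = 0;
        for (; i < len; i++) {
            char c = sa[sp + i];
            if (c > '\u00FF') {
                break;
            }
            da[dp + i] = (byte) c;
        }
        return i;
    }

    public static void main(String[] args) {
        byte[] bytes = {0x41, (byte) 0xE9};
        char[] chars = new char[bytes.length];
        inflateBytesToChars(bytes, 0, chars, 0, bytes.length);
        System.out.println(new String(chars));

        char[] input = {'A', '\u00E9', '\u20AC'};
        byte[] out = new byte[input.length];
        int n = encodeISOArray(input, 0, out, 0, input.length);
        System.out.println(n + " chars encoded");
    }
}
```

Whether a caller reaches such a method directly or through an extra indirection does not change what is computed, which is why the difference between the encoder and decoder paths looks like a consistency question rather than a functional one.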

-------------

PR: https://git.openjdk.java.net/jdk/pull/2574

