[PATCH] Performance bug in String(byte[],int,int,Charset)
Rémi Forax
forax at univ-mlv.fr
Sun Nov 25 15:20:52 UTC 2007
Markus Gaisbauer wrote:
> A bug in java.lang.StringCoding causes a full and unnecessary copy of
> the byte array given as the first argument.
>
It's not a bug, it's a feature :)
I think this copy is a defensive copy that prevents a malicious charset
(decoder) from accessing the underlying buffer.
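
To make the concern concrete, here is a minimal sketch, not the actual JDK code. The class DefensiveCopyDemo and the method hostileDecode are hypothetical names made up for illustration; hostileDecode stands in for an untrusted decoder that writes to the array backing its input instead of only reading it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DefensiveCopyDemo {

    // Hypothetical stand-in for an untrusted decoder: it reaches through
    // the buffer to the backing array and overwrites index 0, which lies
    // outside the [off, off + len) window it was given.
    static void hostileDecode(ByteBuffer in) {
        if (in.hasArray()) {
            in.array()[0] = (byte) '!';
        }
    }

    // Wraps the caller's array directly, so the decoder sees the real array.
    static void decodeShared(byte[] ba, int off, int len) {
        hostileDecode(ByteBuffer.wrap(ba, off, len));
    }

    // Copies the whole array first, as the line under discussion does,
    // so the decoder can only damage the private copy.
    static void decodeCopied(byte[] ba, int off, int len) {
        byte[] b = Arrays.copyOf(ba, ba.length);
        hostileDecode(ByteBuffer.wrap(b, off, len));
    }

    public static void main(String[] args) {
        byte[] shared = {'a', 'b', 'c', 'd'};
        byte[] guarded = {'a', 'b', 'c', 'd'};

        decodeShared(shared, 1, 2);
        decodeCopied(guarded, 1, 2);

        System.out.println(new String(shared, StandardCharsets.US_ASCII));  // !bcd
        System.out.println(new String(guarded, StandardCharsets.US_ASCII)); // abcd
    }
}
```

The cost of the copy in decodeCopied grows with ba.length, not with len. Copying only the requested range with Arrays.copyOfRange(ba, off, off + len) would give the same isolation in this sketch at a cost proportional to len; whether that is acceptable for StringCoding is a separate question.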
By the way, using clone() seems better than Arrays.copyOf() here.
byte[] b = ba.clone();
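
For a full-length copy of a byte array the two calls give the same result, as this small sketch shows. The class and method names are hypothetical and exist only for the comparison.

```java
import java.util.Arrays;

class CloneVsCopyOf {

    // Full-length copy via Arrays.copyOf, as in the current code.
    static byte[] viaCopyOf(byte[] ba) {
        return Arrays.copyOf(ba, ba.length);
    }

    // Full-length copy via clone(), as suggested above.
    static byte[] viaClone(byte[] ba) {
        return ba.clone();
    }

    public static void main(String[] args) {
        byte[] ba = {1, 2, 3, 4};
        byte[] x = viaCopyOf(ba);
        byte[] y = viaClone(ba);

        // Same contents, and both are new arrays distinct from the source.
        System.out.println(Arrays.equals(x, y));  // true
        System.out.println(x != ba && y != ba);   // true
    }
}
```

Since the length argument is just ba.length, clone() says the same thing with less to read.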
> This results in a severe slowdown of the constructor if the byte array is
> big.
>
Rémi
> The attached patch should fix the problem.
>
>
> Unfortunately I do not (yet) have an official bug id for this, as this
> seems to take a while (reported 2 weeks ago).
>
> To reproduce the problem run the following test program:
>
> import java.nio.charset.Charset;
>
> public class StringTest {
>
>     public static void main(String[] args) throws Exception {
>         long before;
>         long after;
>         byte[] data;
>
>         data = new byte[1024*1024*16]; // 16 megabyte
>         data[0] = 'X';
>
>         // warmup
>         new String(data, 0, 1);
>         new String(data, 0, 1, Charset.forName("UTF8"));
>         new String(data, 0, 1, "UTF8");
>
>         before = System.nanoTime();
>         new String(data, 0, 1);
>         after = System.nanoTime();
>         System.out.println((after - before) / 1000000 + "ms");
>
>         before = System.nanoTime();
>         new String(data, 0, 1, Charset.forName("UTF8"));
>         after = System.nanoTime();
>         System.out.println((after - before) / 1000000 + "ms");
>
>         before = System.nanoTime();
>         new String(data, 0, 1, "UTF8");
>         after = System.nanoTime();
>         System.out.println((after - before) / 1000000 + "ms");
>     }
>
> }
>
> ------------------------------------------------------------------------
>
> Index: StringCoding.java
> ===================================================================
> --- StringCoding.java (revision 258)
> +++ StringCoding.java (working copy)
> @@ -193,7 +193,6 @@
>
>      static char[] decode(Charset cs, byte[] ba, int off, int len) {
>          StringDecoder sd = new StringDecoder(cs, cs.name());
> -        byte[] b = Arrays.copyOf(ba, ba.length);
> -        return sd.decode(b, off, len);
> +        return sd.decode(ba, off, len);
>      }
>
>