[PATCH] Performance bug in String(byte[],int,int,Charset)

Rémi Forax forax at univ-mlv.fr
Sun Nov 25 15:20:52 UTC 2007


Markus Gaisbauer wrote:
> A bug in java.lang.StringCoding causes a full and unnecessary copy of
> the byte array given as the first argument.
>   
It's not a bug, it's a feature :)
I think this copy is a defensive copy, there to prevent a malicious charset
(decoder) from accessing the underlying buffer.
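
To make the concern concrete, here is a minimal sketch of what a hostile decoder could do. It assumes the decoder is handed a ByteBuffer that wraps the caller's array directly. LeakyCharset, its name "X-LEAKY-DEMO" and decodeAndCapture() are hypothetical, invented for this illustration; none of them is JDK code, and the sketch drives the decoder itself rather than going through String.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

public class LeakyDecoderDemo {

    // Hypothetical hostile charset: its decoder remembers the backing array.
    static final class LeakyCharset extends Charset {
        byte[] captured; // whatever array the decoder was able to reach

        LeakyCharset() { super("X-LEAKY-DEMO", null); }

        @Override public boolean contains(Charset cs) { return false; }

        @Override public CharsetEncoder newEncoder() {
            throw new UnsupportedOperationException("decode only");
        }

        @Override public CharsetDecoder newDecoder() {
            return new CharsetDecoder(this, 1.0f, 1.0f) {
                @Override
                protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
                    // array() exposes the whole backing array, not just [off, off+len)
                    if (in.hasArray()) captured = in.array();
                    while (in.hasRemaining()) {
                        if (!out.hasRemaining()) return CoderResult.OVERFLOW;
                        out.put((char) (in.get() & 0xff)); // trivial 1:1 mapping
                    }
                    return CoderResult.UNDERFLOW;
                }
            };
        }
    }

    // Decodes ba[off, off+len) and returns the array the decoder got hold of.
    static byte[] decodeAndCapture(byte[] ba, int off, int len) {
        LeakyCharset cs = new LeakyCharset();
        try {
            cs.newDecoder().decode(ByteBuffer.wrap(ba, off, len));
        } catch (CharacterCodingException e) {
            throw new IllegalStateException(e);
        }
        return cs.captured;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1024];
        data[0] = 'X';
        // Without a copy, the decoder sees the caller's own array.
        System.out.println(decodeAndCapture(data, 0, 1) == data);
        // With a defensive copy, it only sees the copy.
        System.out.println(decodeAndCapture(data.clone(), 0, 1) == data);
    }
}
```

Once it holds that reference, the decoder could read bytes outside the requested range or modify the caller's array later, which is presumably what the copy is meant to rule out.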

By the way, using clone() seems better than Arrays.copyOf() here.

byte[] b = ba.clone();
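
A small sketch comparing the two full copies, plus one possible middle ground that is not part of the posted patch: copy only the requested slice, so the cost depends on len and not on ba.length. The class and method names are hypothetical, and the slice variant assumes the decoder is then called with offset 0.

```java
import java.util.Arrays;

public class DefensiveCopyDemo {

    // Full copy, as StringCoding.decode() does in the quoted diff.
    static byte[] fullCopy(byte[] ba) {
        return Arrays.copyOf(ba, ba.length);
    }

    // Same result via clone().
    static byte[] fullClone(byte[] ba) {
        return ba.clone();
    }

    // Hypothetical alternative: copy only [off, off+len).
    // The caller would then decode the result starting at offset 0.
    static byte[] sliceCopy(byte[] ba, int off, int len) {
        return Arrays.copyOfRange(ba, off, off + len);
    }

    public static void main(String[] args) {
        byte[] data = new byte[16 * 1024 * 1024];
        data[0] = 'X';

        byte[] a = fullCopy(data);
        byte[] b = fullClone(data);
        byte[] c = sliceCopy(data, 0, 1);

        System.out.println(Arrays.equals(a, b)); // same contents either way
        System.out.println(a.length + " vs " + c.length + " bytes copied");
    }
}
```

The slice copy still keeps the caller's array away from the decoder, but for the test program below it would copy 1 byte where the full copy moves 16 megabytes.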


> This results in a severe slowdown of the constructor if the byte array is
> big.
>   
Rémi
> The attached patch should fix the problem.
>
>
> Unfortunately I do not (yet) have an official bug id for this, as this
> seems to take a while (reported 2 weeks ago).
>
> To reproduce the problem run the following test program:
>
> import java.nio.charset.Charset;
>
> public class StringTest {
>
>        public static void main(String[] args) throws Exception {
>                long before;
>                long after;
>                byte[] data;
>
>                data = new byte[1024*1024*16]; // 16 megabyte
>                data[0] = 'X';
>
>                // warmup
>                new String(data, 0, 1);
>                new String(data, 0, 1, Charset.forName("UTF8"));
>                new String(data, 0, 1, "UTF8");
>
>                before = System.nanoTime();
>                new String(data, 0, 1);
>                after = System.nanoTime();
>                System.out.println((after - before) / 1000000 + "ms");
>
>                before = System.nanoTime();
>                new String(data, 0, 1, Charset.forName("UTF8"));
>                after = System.nanoTime();
>                System.out.println((after - before) / 1000000 + "ms");
>
>                before = System.nanoTime();
>                new String(data, 0, 1, "UTF8");
>                after = System.nanoTime();
>                System.out.println((after - before) / 1000000 + "ms");
>        }
>
> }
>   
> ------------------------------------------------------------------------
>
> Index: StringCoding.java
> ===================================================================
> --- StringCoding.java	(revision 258)
> +++ StringCoding.java	(working copy)
> @@ -193,7 +193,6 @@
>  
>      static char[] decode(Charset cs, byte[] ba, int off, int len) {
>          StringDecoder sd = new StringDecoder(cs, cs.name());
> -        byte[] b = Arrays.copyOf(ba, ba.length);
> -        return sd.decode(b, off, len);
> +        return sd.decode(ba, off, len);
>      }
>  
>   




More information about the core-libs-dev mailing list