From vincent.ryan at sun.com  Mon Mar  1 18:00:23 2010
From: vincent.ryan at sun.com (vincent.ryan at sun.com)
Date: Mon, 01 Mar 2010 18:00:23 +0000
Subject: hg: jdk7/tl/jdk: 2 new changesets
Message-ID: <20100301180103.D0ACF41DBD@hg.openjdk.java.net>

Changeset: 78d91c4223cb
Author:    vinnie
Date:      2010-03-01 17:54 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/78d91c4223cb

6921001: api/java_security/IdentityScope/IdentityScopeTests.html#getSystemScope fails starting from b78 JDK7
Reviewed-by: mullan

! src/share/classes/java/security/IdentityScope.java
! src/share/lib/security/java.security
+ test/java/security/IdentityScope/NoDefaultSystemScope.java

Changeset: 893034df4ec2
Author:    vinnie
Date:      2010-03-01 18:00 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/893034df4ec2

Merge

- test/java/nio/file/WatchService/OverflowEventIsLoner.java


From joe.darcy at sun.com  Tue Mar  2 21:57:06 2010
From: joe.darcy at sun.com (joe.darcy at sun.com)
Date: Tue, 02 Mar 2010 21:57:06 +0000
Subject: hg: jdk7/tl/langtools: 6931130: Remove unused AnnotationCollector
	code from JavacProcessingEnvironment
Message-ID: <20100302215709.B65C341F86@hg.openjdk.java.net>

Changeset: 7c23bbbe0dbd
Author:    darcy
Date:      2010-03-02 14:06 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/7c23bbbe0dbd

6931130: Remove unused AnnotationCollector code from JavacProcessingEnvironment
Reviewed-by: jjg

! src/share/classes/com/sun/tools/javac/processing/JavacProcessingEnvironment.java


From Ulf.Zibis at gmx.de  Tue Mar  2 23:34:08 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 03 Mar 2010 00:34:08 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4A9578C4.8060801@sun.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
Message-ID: <4B8DA070.3040306@gmx.de>

On 26.08.2009 20:02, Xueming Shen wrote:
>
> For example, isBMP(int) might be convenient, but it can be
> easily achieved by the one-line code
>
> (int)(char)codePoint == codePoint;
>
> or the more readable form
>
>    codePoint < Character.MIN_SUPPLEMENTARY_CODE_POINT;
>

In class sun.nio.cs.Surrogate we have:
     public static boolean isBMP(int uc) {
         return (int) (char) uc == uc;
     }

1.) It's enough to have:
         return (char)uc == uc;
     better:
         assert MIN_VALUE == 0 && MAX_VALUE == 0xFFFF;
         return (char)uc == uc;
         // Optimized form of: uc >= MIN_VALUE && uc <= MAX_VALUE

2.) Above code is compiled to (needs 16 bytes of machine code):
   0x00b87ad8: mov    %ebx,%ebp
   0x00b87ada: and    $0xffff,%ebp
   0x00b87ae0: cmp    %ebx,%ebp
   0x00b87ae2: jne    0x00b87c52
   0x00b87ae8:

     We could code:
         assert MIN_VALUE == 0 && (MAX_VALUE + 1) == (1 << 16);
         return (uc >> 16) == 0;
         // Optimized form of: uc >= MIN_VALUE && uc <= MAX_VALUE

     is compiled to (needs only 9 bytes of machine code):
   0x00b87aac: mov    %ebx,%ecx
   0x00b87aae: sar    $0x10,%ecx
   0x00b87ab1: test   %ecx,%ecx
   0x00b87ab3: je     0x00b87acb
   0x00b87ab5:
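
A minimal sketch for comparing the forms discussed above on a few boundary
values. The class and method names are hypothetical; only the return
expressions are taken from this mail. It checks the results only and says
nothing about the generated machine code.

```java
// Hypothetical demo class; only the three return expressions come from the thread.
final class IsBmpForms {
    static boolean byCast(int uc) {
        return (char) uc == uc;        // narrowing keeps only the low 16 bits
    }

    static boolean byShift(int uc) {
        return (uc >> 16) == 0;        // true only if the upper 16 bits are all zero
    }

    static boolean byRange(int uc) {
        return uc >= Character.MIN_VALUE && uc <= Character.MAX_VALUE;
    }

    static boolean allAgree(int uc) {
        return byCast(uc) == byShift(uc) && byShift(uc) == byRange(uc);
    }

    public static void main(String[] args) {
        int[] samples = { Integer.MIN_VALUE, -1, 0, 0xFFFF, 0x10000, 0x10FFFF, Integer.MAX_VALUE };
        for (int uc : samples)
            System.out.printf("%08x cast=%b shift=%b range=%b%n",
                    uc, byCast(uc), byShift(uc), byRange(uc));
    }
}
```

For the BMP test the signed shift is sufficient, because a negative input
keeps a non-zero upper half after the shift.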

1.) If we have:
     public static boolean isSupplementaryCodePoint(int codePoint) {
         assert MIN_SUPPLEMENTARY_CODE_POINT == (1 << 16) &&
                 (MAX_SUPPLEMENTARY_CODE_POINT + 1) % (1 << 16) == 0;
         return (codePoint >> 16) != 0
                 && (codePoint >> 16) < (MAX_SUPPLEMENTARY_CODE_POINT + 1 >> 16);
         // Optimized form of: codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
         // && codePoint <= MAX_SUPPLEMENTARY_CODE_POINT;
     }
and:
         if (Surrogate.isBMP(uc))
             ...;
         else if (Character.isSupplementaryCodePoint(uc))
             ...;
         else
             ...;

     we get (needs only 18 bytes of machine code):
   0x00b87aac: mov    %ebx,%ecx
   0x00b87aae: sar    $0x10,%ecx
   0x00b87ab1: test   %ecx,%ecx
   0x00b87ab3: je     0x00b87acb
   0x00b87ab5: cmp    $0x11,%ecx
   0x00b87ab8: jge    0x00b87ce6
   0x00b87abe:


-Ulf


From jonathan.gibbons at sun.com  Wed Mar  3 00:41:43 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Wed, 03 Mar 2010 00:41:43 +0000
Subject: hg: jdk7/tl/langtools: 6931482: minor findbugs fixes
Message-ID: <20100303004146.B692D41FB5@hg.openjdk.java.net>

Changeset: 6e1e2738c530
Author:    jjg
Date:      2010-03-02 16:40 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/6e1e2738c530

6931482: minor findbugs fixes
Reviewed-by: darcy

! src/share/classes/com/sun/tools/classfile/ConstantPool.java
! src/share/classes/com/sun/tools/javadoc/DocEnv.java
! src/share/classes/com/sun/tools/javadoc/SeeTagImpl.java


From jonathan.gibbons at sun.com  Wed Mar  3 00:44:42 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Wed, 03 Mar 2010 00:44:42 +0000
Subject: hg: jdk7/tl/langtools: 6931127: strange test class files
Message-ID: <20100303004445.7B6E041FB6@hg.openjdk.java.net>

Changeset: 235135d61974
Author:    jjg
Date:      2010-03-02 16:43 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/235135d61974

6931127: strange test class files
Reviewed-by: darcy

! test/tools/javac/annotations/neg/Constant.java
! test/tools/javac/generics/Casting.java
! test/tools/javac/generics/Casting3.java
! test/tools/javac/generics/Casting4.java
! test/tools/javac/generics/InnerInterface1.java
! test/tools/javac/generics/InnerInterface2.java
! test/tools/javac/generics/Multibound1.java
! test/tools/javac/generics/MultipleInheritance.java
! test/tools/javac/generics/NameOrder.java
! test/tools/javac/generics/PermuteBound.java
! test/tools/javac/generics/PrimitiveVariant.java


From martinrb at google.com  Wed Mar  3 08:00:05 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 3 Mar 2010 00:00:05 -0800
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4B8DA070.3040306@gmx.de>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
Message-ID: <1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>

On Tue, Mar 2, 2010 at 15:34, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 26.08.2009 20:02, Xueming Shen wrote:
>>
>> For example, isBMP(int) might be convenient, but it can be easily
>> achieved by the one-line code
>>
>> (int)(char)codePoint == codePoint;
>>
>> or the more readable form
>>
>>   codePoint < Character.MIN_SUPPLEMENTARY_CODE_POINT;
>>
>
> In class sun.nio.cs.Surrogate we have:
>    public static boolean isBMP(int uc) {
>        return (int) (char) uc == uc;
>    }
>
> 1.) It's enough to have:
>        return (char)uc == uc;
>    better:
>        assert MIN_VALUE == 0 && MAX_VALUE == 0xFFFF;
>        return (char)uc == uc;
>        // Optimized form of: uc >= MIN_VALUE && uc <= MAX_VALUE
>
> 2.) Above code is compiled to (needs 16 bytes of machine code):
>  0x00b87ad8: mov    %ebx,%ebp
>  0x00b87ada: and    $0xffff,%ebp
>  0x00b87ae0: cmp    %ebx,%ebp
>  0x00b87ae2: jne    0x00b87c52
>  0x00b87ae8:
>
>    We could code:
>        assert MIN_VALUE == 0 && (MAX_VALUE + 1) == (1 << 16);
>        return (uc >> 16) == 0;
>        // Optimized form of: uc >= MIN_VALUE && uc <= MAX_VALUE

I agree that
return (uc >> 16) == 0;
is marginally better than my
return (int) (char) uc == uc;
(although I think the redundant cast to int
makes the code more readable).
I approve such a change to isBMPCodePoint()
and inclusion of such a method in Character.


>    is compiled to (needs only 9 bytes of machine code):
>  0x00b87aac: mov    %ebx,%ecx
>  0x00b87aae: sar    $0x10,%ecx
>  0x00b87ab1: test   %ecx,%ecx
>  0x00b87ab3: je     0x00b87acb
>  0x00b87ab5:
>
> 1.) If we have:
>    public static boolean isSupplementaryCodePoint(int codePoint) {
>        assert MIN_SUPPLEMENTARY_CODE_POINT == (1 << 16) &&
>                (MAX_SUPPLEMENTARY_CODE_POINT + 1) % (1 << 16) == 0;
>        return (codePoint >> 16) != 0
>                && (codePoint >> 16) < (MAX_SUPPLEMENTARY_CODE_POINT + 1 >> 16);
>        // Optimized form of: codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>        // && codePoint <= MAX_SUPPLEMENTARY_CODE_POINT;
>    }

Keep in mind that supplementary characters are extremely rare.
Therefore the existing implementation

 return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
            && codePoint <= MAX_CODE_POINT;

will almost always perform just one comparison against a constant,
which is hard to beat.

I'm not sure whether your code above gets the right answer for negative input.
Perhaps you need to do
(codePoint >>> 16) < ...
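
A minimal sketch of the difference for negative input. It assumes that
Character.MAX_CODE_POINT stands in for the MAX_SUPPLEMENTARY_CODE_POINT
name used in the quoted code; the class and method names are hypothetical.

```java
// Hypothetical demo class contrasting the signed and unsigned shift.
final class SupplementaryShift {
    // Quoted form: arithmetic shift (>>) copies the sign bit into the upper half.
    static boolean signedShift(int cp) {
        return (cp >> 16) != 0
                && (cp >> 16) < (Character.MAX_CODE_POINT + 1 >> 16);
    }

    // Suggested form: logical shift (>>>) fills the upper half with zeros.
    static boolean unsignedShift(int cp) {
        return (cp >>> 16) != 0
                && (cp >>> 16) < (Character.MAX_CODE_POINT + 1 >>> 16);
    }

    public static void main(String[] args) {
        int[] samples = { -1, Integer.MIN_VALUE, 0xFFFF, 0x10000, 0x10FFFF, 0x110000 };
        for (int cp : samples)
            System.out.printf("%08x signed=%b unsigned=%b reference=%b%n",
                    cp, signedShift(cp), unsignedShift(cp),
                    Character.isSupplementaryCodePoint(cp));
    }
}
```

With the signed shift, -1 >> 16 is still -1, which is non-zero and less
than 0x11, so a negative value is wrongly reported as supplementary.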

Martin

> and:
>        if (Surrogate.isBMP(uc))
>            ...;
>        else if (Character.isSupplementaryCodePoint(uc))
>            ...;
>        else
>            ...;
>
>    we get (needs only 18 bytes of machine code):
>  0x00b87aac: mov    %ebx,%ecx
>  0x00b87aae: sar    $0x10,%ecx
>  0x00b87ab1: test   %ecx,%ecx
>  0x00b87ab3: je     0x00b87acb
>  0x00b87ab5: cmp    $0x11,%ecx
>  0x00b87ab8: jge    0x00b87ce6
>  0x00b87abe:
>
>
> -Ulf
>
>
>
>
>


From Ulf.Zibis at gmx.de  Wed Mar  3 10:44:51 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 03 Mar 2010 11:44:51 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>	
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
Message-ID: <4B8E3DA3.7090902@gmx.de>

On 03.03.2010 09:00, Martin Buchholz wrote:
> On Tue, Mar 2, 2010 at 15:34, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> On 26.08.2009 20:02, Xueming Shen wrote:
>>
>>> For example, isBMP(int) might be convenient, but it can be easily
>>> achieved by the one-line code
>>>
>>> (int)(char)codePoint == codePoint;
>>>
>>> or the more readable form
>>>
>>>    codePoint < Character.MIN_SUPPLEMENTARY_CODE_POINT;
>>>
>>>
>> In class sun.nio.cs.Surrogate we have:
>>     public static boolean isBMP(int uc) {
>>         return (int) (char) uc == uc;
>>     }
>>
>> 1.) It's enough to have:
>>         return (char)uc == uc;
>>     better:
>>         assert MIN_VALUE == 0 && MAX_VALUE == 0xFFFF;
>>         return (char)uc == uc;
>>         // Optimized form of: uc >= MIN_VALUE && uc <= MAX_VALUE
>>
>> 2.) Above code is compiled to (needs 16 bytes of machine code):
>>   0x00b87ad8: mov    %ebx,%ebp
>>   0x00b87ada: and    $0xffff,%ebp
>>   0x00b87ae0: cmp    %ebx,%ebp
>>   0x00b87ae2: jne    0x00b87c52
>>   0x00b87ae8:
>>
>>     We could code:
>>         assert MIN_VALUE == 0 && (MAX_VALUE + 1) == (1 << 16);
>>         return (uc >> 16) == 0;
>>         // Optimized form of: uc >= MIN_VALUE && uc <= MAX_VALUE
>>      
> I agree that
> return (uc >> 16) == 0;
> is marginally better than my
> return (int) (char) uc == uc;
> (although I think the redundant cast to int
> makes the code more readable).
>    

That seems to be a matter of taste. I always stumble over superfluous
casts, wondering what they are supposed to do.

> I approve such a change to isBMPCodePoint()
> and inclusion of such a method in Character.
>    

Pleased! Who could file the bug? I would provide the patch.

>
>    
>>     is compiled to (needs only 9 bytes of machine code):
>>   0x00b87aac: mov    %ebx,%ecx
>>   0x00b87aae: sar    $0x10,%ecx
>>   0x00b87ab1: test   %ecx,%ecx
>>   0x00b87ab3: je     0x00b87acb
>>   0x00b87ab5:
>>
>> 1.) If we have:
>>     public static boolean isSupplementaryCodePoint(int codePoint) {
>>         assert MIN_SUPPLEMENTARY_CODE_POINT == (1 << 16) &&
>>                 (MAX_SUPPLEMENTARY_CODE_POINT + 1) % (1 << 16) == 0;
>>         return (codePoint >> 16) != 0
>>                 && (codePoint >> 16) < (MAX_SUPPLEMENTARY_CODE_POINT + 1 >> 16);
>>         // Optimized form of: codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>         // && codePoint <= MAX_SUPPLEMENTARY_CODE_POINT;
>>     }
>>      
> Keep in mind that supplementary characters are extremely rare.
>    

Yes, but many APIs in the JDK are rarely used.
Why should they waste memory footprint or perform badly, particularly if
avoiding that doesn't cost anything?

> Therefore the existing implementation
>
>   return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>              && codePoint <= MAX_CODE_POINT;
>
> will almost always perform just one comparison against a constant,
> which is hard to beat.
>    

1. Wondering: I think there are TWO comparisons.
2. Those comparisons need to load 32-bit values from the machine code,
against only 8-bit values in my case.
3. The first of the 2 comparisons is eliminated if compiled in
combination with isBMPCodePoint() (see below).


> I'm not sure whether your code above gets the right answer for negative input.
> Perhaps you need to do
> (codePoint >>> 16) < ...
>    

Oops, I'm afraid you are right, but fortunately it doesn't cost
anything: sar is simply replaced by shr.

> Martin
>
>    
>> and:
>>         if (Surrogate.isBMP(uc))
>>             ...;
>>         else if (Character.isSupplementaryCodePoint(uc))
>>             ...;
>>         else
>>             ...;
>>
>>     we get (needs only 18 bytes of machine code):
>>   0x00b87aac: mov    %ebx,%ecx
>>   0x00b87aae: sar    $0x10,%ecx
>>   0x00b87ab1: test   %ecx,%ecx
>>   0x00b87ab3: je     0x00b87acb
>>   0x00b87ab5: cmp    $0x11,%ecx
>>   0x00b87ab8: jge    0x00b87ce6
>>   0x00b87abe:
>>
>>      

BTW, the compiled form of the existing code (needs *36* bytes of machine
code):
   0x00b87a5c: test   %ebp,%ebp
   0x00b87a5e: jl     0x00b87a68
   0x00b87a60: cmp    $0x10000,%ebp
   0x00b87a66: jl     0x00b87a8d
   0x00b87a68: cmp    $0x10000,%ebp
   0x00b87a6e: jl     0x00b87c63
   0x00b87a74: cmp    $0x10ffff,%ebp
   0x00b87a7a: jg     0x00b87c63
   0x00b87a80:

BTW 2: The code example comes from one of the String constructors, where
the 2-comparison code is manually inlined instead of using
Surrogate.isBMP().

-Ulf


From martinrb at google.com  Wed Mar  3 16:06:14 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 3 Mar 2010 08:06:14 -0800
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4B8E3DA3.7090902@gmx.de>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
Message-ID: <1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>

Sherman, would you like to file bugs for Ulf's improvements?

On Wed, Mar 3, 2010 at 02:44, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 03.03.2010 09:00, Martin Buchholz wrote:

>> Keep in mind that supplementary characters are extremely rare.
>>
>
> Yes, but many API's in the JDK are used rarely.
> Why should they waste memory footprint / perform bad, particularly if it
> doesn't cost anything.

I admire your perfectionism.

>> Therefore the existing implementation
>>
>>  return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>             && codePoint <= MAX_CODE_POINT;
>>
>> will almost always perform just one comparison against a constant,
>> which is hard to beat.
>>
>
> 1. Wondering: I think there are TWO comparisons.
> 2. Those comparisons need to load 32 bit values from machine code, against
> only 8 bit values in my case.

It's a good point.  In the machine code, shifts are likely to use
immediate values, and so will be a small win.

int x = codePoint >>> 16;
return x != 0 && x < 0x11;

(On modern hardware, these optimizations
are less valuable than they used to be;
ordinary integer arithmetic is almost free)

Martin


From alan.bateman at sun.com  Wed Mar  3 16:10:11 2010
From: alan.bateman at sun.com (alan.bateman at sun.com)
Date: Wed, 03 Mar 2010 16:10:11 +0000
Subject: hg: jdk7/tl/jdk: 6931216: TEST_BUG:
	test/java/nio/file/WatchService/LotsOfEvents.java failed with NPE
Message-ID: <20100303161050.CC14143C39@hg.openjdk.java.net>

Changeset: cddb43b12d28
Author:    alanb
Date:      2010-03-03 16:09 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/cddb43b12d28

6931216: TEST_BUG: test/java/nio/file/WatchService/LotsOfEvents.java failed with NPE
Reviewed-by: chegar

! test/java/nio/file/WatchService/LotsOfEvents.java


From Xueming.Shen at Sun.COM  Wed Mar  3 19:11:40 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Wed, 03 Mar 2010 11:11:40 -0800
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
Message-ID: <4B8EB46C.1010208@sun.com>

#6931812

Martin Buchholz wrote:
> Sherman, would you like to file bugs for Ulf's improvements?
>
> On Wed, Mar 3, 2010 at 02:44, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>   
>> On 03.03.2010 09:00, Martin Buchholz wrote:
>>     
>
>   
>>> Keep in mind that supplementary characters are extremely rare.
>>>
>>>       
>> Yes, but many API's in the JDK are used rarely.
>> Why should they waste memory footprint / perform bad, particularly if it
>> doesn't cost anything.
>>     
>
> I admire your perfectionism.
>
>   
>>> Therefore the existing implementation
>>>
>>>  return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>>             && codePoint <= MAX_CODE_POINT;
>>>
>>> will almost always perform just one comparison against a constant,
>>> which is hard to beat.
>>>
>>>       
>> 1. Wondering: I think there are TWO comparisons.
>> 2. Those comparisons need to load 32 bit values from machine code, against
>> only 8 bit values in my case.
>>     
>
> It's a good point.  In the machine code, shifts are likely to use
> immediate values, and so will be a small win.
>
> int x = codePoint >>> 16;
> return x != 0 && x < 0x11;
>
> (On modern hardware, these optimizations
> are less valuable than they used to be;
> ordinary integer arithmetic is almost free)
>
> Martin
>   


From kelly.ohair at sun.com  Wed Mar  3 19:30:18 2010
From: kelly.ohair at sun.com (kelly.ohair at sun.com)
Date: Wed, 03 Mar 2010 19:30:18 +0000
Subject: hg: jdk7/tl/jdk: 6931763: sanity checks broken with latest cygwin,
	newer egrep -i option problems
Message-ID: <20100303193038.0881343C6C@hg.openjdk.java.net>

Changeset: 507159d8d143
Author:    ohair
Date:      2010-03-03 11:29 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/507159d8d143

6931763: sanity checks broken with latest cygwin, newer egrep -i option problems
Reviewed-by: jjg

! make/common/shared/Sanity.gmk


From Ulf.Zibis at gmx.de  Wed Mar  3 21:31:07 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 03 Mar 2010 22:31:07 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>	
	<4B8DA070.3040306@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
Message-ID: <4B8ED51B.2030303@gmx.de>

On 03.03.2010 17:06, Martin Buchholz wrote:
> Sherman, would you like to file bugs for Ulf's improvements?
>    

Thanks.

> I admire your perfectionism.
>    

Really? :-)

> (On modern hardware, these optimizations
> are less valuable than they used to be;
> ordinary integer arithmetic is almost free)
>    

IMHO even on modern hardware, half the machine code bytes should run ~
twice as fast. ;-)

-Ulf


From joe.darcy at sun.com  Thu Mar  4 00:05:47 2010
From: joe.darcy at sun.com (joe.darcy at sun.com)
Date: Thu, 04 Mar 2010 00:05:47 +0000
Subject: hg: jdk7/tl/langtools: 6449781: TypeElement.getQualifiedName for
	anonymous classes returns null instead of an empty name
Message-ID: <20100304000550.9A71443CAE@hg.openjdk.java.net>

Changeset: fc7132746501
Author:    darcy
Date:      2010-03-03 16:05 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/fc7132746501

6449781: TypeElement.getQualifiedName for anonymous classes returns null instead of an empty name
Reviewed-by: jjg

! src/share/classes/com/sun/tools/javac/jvm/ClassReader.java
+ test/tools/javac/processing/model/element/TestAnonClassNames.java
+ test/tools/javac/processing/model/element/TestAnonSourceNames.java


From jonathan.gibbons at sun.com  Thu Mar  4 01:23:54 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Thu, 04 Mar 2010 01:23:54 +0000
Subject: hg: jdk7/tl/langtools: 6931927: position issues with synthesized
	anonymous class
Message-ID: <20100304012357.898BF43CC4@hg.openjdk.java.net>

Changeset: 7f5db2e8b423
Author:    jjg
Date:      2010-03-03 17:22 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/7f5db2e8b423

6931927: position issues with synthesized anonymous class
Reviewed-by: darcy

! src/share/classes/com/sun/tools/javac/parser/JavacParser.java
+ test/tools/javac/tree/TestAnnotatedAnonClass.java
+ test/tools/javac/tree/TreePosTest.java
- test/tools/javac/treepostests/TreePosTest.java


From kevin.l.stern at gmail.com  Thu Mar  4 01:41:26 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Wed, 3 Mar 2010 19:41:26 -0600
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
Message-ID: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>

Greetings,

I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
java.io.ByteArrayOutputStream which arise when the capacities of the data
structures reach a particular threshold.  More below.

When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE, its size
reaches its capacity, and an add or an insert operation is invoked, the
capacity is increased by only one element.  Notice that in the following
excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) *
oldCapacity + 1 unless this value would not suffice to accommodate the
required capacity, in which case it is set to the required capacity.  If the
current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity *
3)/2 + 1 overflows and resolves to a negative number, resulting in the new
capacity being set to the required capacity.  The major consequence of this
is that each subsequent add/insert operation results in a full resize of the
ArrayList, causing performance to degrade significantly.

        int newCapacity = (oldCapacity * 3)/2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
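
A minimal sketch of the arithmetic in that excerpt, pulled out into a
hypothetical helper so that it can be run without allocating large arrays.
Note that in int arithmetic the intermediate product oldCapacity * 3 already
wraps once oldCapacity exceeds Integer.MAX_VALUE / 3, so the effect can be
observed earlier than the (2/3) figure suggests.

```java
// Hypothetical helper mirroring the quoted growth computation; not the JDK source.
final class ArrayListGrowthOverflow {
    static int naiveNewCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = (oldCapacity * 3) / 2 + 1;  // oldCapacity * 3 may wrap around
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;                // falls back to growing by one
        return newCapacity;
    }

    public static void main(String[] args) {
        int[] capacities = { 10, Integer.MAX_VALUE / 3, Integer.MAX_VALUE / 3 + 1, 1_500_000_000 };
        for (int old : capacities)
            System.out.println(old + " -> " + naiveNewCapacity(old, old + 1));
    }
}
```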

Hashtable breaks entirely when the size of its backing array reaches (1/2) *
Integer.MAX_VALUE and a rehash is necessary as is evident from the following
excerpt from rehash.  Notice that rehash will attempt to create an array of
negative size if the size of the backing array reaches (1/2) *
Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a
negative number.

    int newCapacity = oldCapacity * 2 + 1;
    HashtableEntry newTable[] = new HashtableEntry[newCapacity];
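
A minimal sketch of that failure mode, using Object[] in place of the table
type and a hypothetical helper for the capacity computation. A negative
array size is rejected before any memory is allocated, so this is cheap to
run. In int arithmetic the wrap occurs for oldCapacity values of 2^30 and
above.

```java
// Hypothetical demo of the quoted rehash arithmetic; not the JDK source.
final class RehashOverflow {
    static int naiveRehashCapacity(int oldCapacity) {
        return oldCapacity * 2 + 1;   // wraps to a negative value for oldCapacity >= 2^30
    }

    // Returns true if allocating the new table fails because of a negative size.
    static boolean allocationFails(int oldCapacity) {
        try {
            Object[] table = new Object[naiveRehashCapacity(oldCapacity)];
            return table.length < 0;  // never true; reached only when allocation succeeded
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(naiveRehashCapacity(1 << 30));
        System.out.println(allocationFails(1 << 30));
    }
}
```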

When the capacity of the backing array in a ByteArrayOutputStream reaches
(1/2) * Integer.MAX_VALUE, its size reaches its capacity, and a write
operation is invoked, the capacity of the backing array is increased only by
the required number of elements.  Notice that in the following excerpt from
ByteArrayOutputStream.write(int) the new backing array capacity is set to 2
* buf.length unless this value would not suffice to accommodate the required
capacity, in which case it is set to the required capacity.  If the current
backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
buf.length << 1 overflows and resolves to a negative number, resulting in the
new capacity being set to the required capacity.  The major consequence of
this, as with ArrayList, is that each subsequent write operation results
in a full resize of the ByteArrayOutputStream, causing performance to degrade
significantly.

    int newcount = count + 1;
    if (newcount > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
    }

It is interesting to note that any statements about the amortized time
complexity of add/insert operations, such as the one in the ArrayList
javadoc, are invalidated by the performance related bugs.  One solution to
the above situations is to set the new capacity of the backing array to
Integer.MAX_VALUE when the initial size calculation results in a negative
number during a resize.
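
One possible shape of such a fix, as a sketch only. Instead of testing for a
negative result it computes the candidate in long arithmetic and clamps it,
which is a different technique from the one described above and also covers
products that wrap past 2^32 back to a positive value. The class and method
names are hypothetical, and whether a given VM can actually allocate an
array of Integer.MAX_VALUE elements is a separate question.

```java
// Hypothetical overflow-aware growth policy; a sketch, not the JDK implementation.
final class SaturatingGrowth {
    static int grow(int oldCapacity, int minCapacity) {
        if (minCapacity < 0)          // the caller's own size computation overflowed
            throw new OutOfMemoryError("required capacity overflowed int");
        // Compute 1.5x + 1 in long arithmetic so the intermediate product cannot wrap.
        long candidate = ((long) oldCapacity * 3) / 2 + 1;
        if (candidate > Integer.MAX_VALUE)
            candidate = Integer.MAX_VALUE;   // saturate instead of going negative
        return (int) Math.max(candidate, (long) minCapacity);
    }

    public static void main(String[] args) {
        System.out.println(grow(10, 11));
        System.out.println(grow(1_500_000_000, 1_500_000_001));
    }
}
```

With this policy a list near the limit makes one final jump to the maximum
instead of resizing on every add.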

Apologies if these bugs are already known.

Regards,

Kevin

From jason_mehrens at hotmail.com  Thu Mar  4 02:08:46 2010
From: jason_mehrens at hotmail.com (Jason Mehrens)
Date: Wed, 3 Mar 2010 20:08:46 -0600
Subject: Need reviewer for forward port of 6815768 (File.getXXXSpace)
	and	6815768 (String.hashCode)
In-Reply-To: <4B8A952B.1010007@gmx.de>
References: <4B86B8A4.8050709@sun.com>
	<4B86BF2E.8030208@sun.com>,<4B86CAC5.4050405@sun.com>
	<4B86DAE1.5050208@gmx.de>,<4B86DBDC.9090703@sun.com>
	<4B86F490.60704@sun.com>,<4B8A952B.1010007@gmx.de>
Message-ID: <SNT114-W49550CBCE9E2816A5DFCB783390@phx.gbl>


String.hash should only have two known states, zero and the actual computed hash code.

 
http://bugs.sun.com/view_bug.do?bug_id=6611830

 
Jason


> Date: Sun, 28 Feb 2010 17:09:15 +0100
> From: Ulf.Zibis at gmx.de
> To: Alan.Bateman at Sun.COM
> Subject: Re: Need reviewer for forward port of 6815768 (File.getXXXSpace) and 6815768 (String.hashCode)
> CC: core-libs-dev at openjdk.java.net; dmitry.nadezhin at gmail.com; Kelly.Ohair at Sun.COM
> 
> Another thought:
> 
> In the constructors of String we could initialize hash = 
> Integer.MIN_VALUE except if length == 0.
> Then we could stay at the fastest version:
> 
> public int hashCode() {
>     int h = hash;
>     if (h == Integer.MIN_VALUE) {
>         h = 0;
>         char[] val = value;
>         for (int i = offset, limit = count + i; i != limit; )
>             h = 31 * h + val[i++];
>         hash = h;
>     }
>     return h;
> }
 		 	   		  

From weijun.wang at sun.com  Thu Mar  4 02:40:26 2010
From: weijun.wang at sun.com (weijun.wang at sun.com)
Date: Thu, 04 Mar 2010 02:40:26 +0000
Subject: hg: jdk7/tl/jdk: 3 new changesets
Message-ID: <20100304024122.AFDFC43CD7@hg.openjdk.java.net>

Changeset: 61c298558549
Author:    weijun
Date:      2010-03-04 10:37 +0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/61c298558549

6844909: support allow_weak_crypto in krb5.conf
Reviewed-by: valeriep

! src/share/classes/sun/security/krb5/internal/crypto/EType.java
+ test/sun/security/krb5/etype/WeakCrypto.java
+ test/sun/security/krb5/etype/weakcrypto.conf

Changeset: 0f383673ce31
Author:    weijun
Date:      2010-03-04 10:38 +0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/0f383673ce31

6923681: Jarsigner crashes during timestamping
Reviewed-by: vinnie

! src/share/classes/sun/security/tools/TimestampedSigner.java

Changeset: 5e15b70e6d27
Author:    weijun
Date:      2010-03-04 10:38 +0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/5e15b70e6d27

6880321: sun.security.provider.JavaKeyStore abuse of OOM Exception handling
Reviewed-by: xuelei

! src/share/classes/sun/security/provider/JavaKeyStore.java


From jonathan.gibbons at sun.com  Thu Mar  4 03:35:50 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Thu, 04 Mar 2010 03:35:50 +0000
Subject: hg: jdk7/tl/langtools: 6931126: jtreg tests not Windows friendly
Message-ID: <20100304033552.E838743CE5@hg.openjdk.java.net>

Changeset: 117c95448ab9
Author:    jjg
Date:      2010-03-03 19:34 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/117c95448ab9

6931126: jtreg tests not Windows friendly
Reviewed-by: darcy

! test/tools/javac/ThrowsIntersection_1.java
! test/tools/javac/ThrowsIntersection_2.java
! test/tools/javac/ThrowsIntersection_3.java
! test/tools/javac/ThrowsIntersection_4.java
! test/tools/javac/generics/NameOrder.java


From develop4lasu at gmail.com  Thu Mar  4 18:33:43 2010
From: develop4lasu at gmail.com (Marek Kozieł)
Date: Thu, 4 Mar 2010 19:33:43 +0100
Subject: Need reviewer for forward port of 6815768 (File.getXXXSpace) and 
	6815768 (String.hashCode)
In-Reply-To: <4B8A952B.1010007@gmx.de>
References: <4B86B8A4.8050709@sun.com> <4B86BF2E.8030208@sun.com>
	<4B86CAC5.4050405@sun.com> <4B86DAE1.5050208@gmx.de>
	<4B86DBDC.9090703@sun.com> <4B86F490.60704@sun.com>
	<4B8A952B.1010007@gmx.de>
Message-ID: <28bca0ff1003041033j2547fe95l80614ddf38799331@mail.gmail.com>

2010/2/28 Ulf Zibis <Ulf.Zibis at gmx.de>:
> On 25.02.2010 23:07, Alan Bateman wrote:
>>
>> Kelly O'Hair wrote:
>>>
>>> Yup.  My eyes must be tired, I didn't see that. :^(
>>
>> Too many repositories in the air at the same time. The webrev has been
>> refreshed. Thanks Ulf.
>>
>>
>
> Another thought:
>
> In the constructors of String we could initialize hash = Integer.MIN_VALUE
> except if length == 0.
> Then we could stay at the fastest version:
>
>    public int hashCode() {
>        int h = hash;
>        if (h == Integer.MIN_VALUE) {
>            h = 0;
>            char[] val = value;
>            for (int i = offset, limit = count + i; i != limit; )
>                h = 31 * h + val[i++];
>            hash = h;
>        }
>        return h;
>    }
>
> As an alternative we could use:
> private static final int UNKNOWN_HASH = 1;
> Justification:
> Using a small value results in little shorter byte code and machine code
> footprint after compilation.
> Additionally on some CPU's this likely will perform little better, but never
> worse.
>
> Please note:
> Original loop causes 2 values to increment:
>            for (int i = 0; i < len; i++) {
>                h = 31*h + val[off++];
>            }
> This is inefficient as I have proved in a little micro-benchmark.
>
> -Ulf
>
>
>
>

Hello,
I would suggest:
public int hashCode() {
    int h = hash;
    if (h == 0) {
        h = 0;
        char[] val = value;
        for (int i = offset, limit = count + i; i != limit; )
            h = 31 * h + val[i++];
        if (h == 0)
            h++;
        hash = h;
    }
    return h;
}
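
A minimal sketch of this variant on a hypothetical stand-in class (String
itself cannot be modified here), assuming the same 31 * h + c formula. One
consequence worth checking: remapping a computed 0 to 1 changes the value
returned for inputs that hash to 0, such as the empty string, so the result
no longer matches the formula documented for String.hashCode() in those
cases.

```java
// Hypothetical stand-in for String: same hash formula, plus the "never cache 0" tweak.
final class NonZeroCachedHash {
    private final char[] value;
    private int hash;                 // 0 means "not computed yet"

    NonZeroCachedHash(String s) {
        this.value = s.toCharArray();
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            for (char c : value)
                h = 31 * h + c;
            if (h == 0)
                h++;                  // remap a genuine 0 so it is never recomputed
            hash = h;
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(new NonZeroCachedHash("abc").hashCode() + " vs " + "abc".hashCode());
        System.out.println(new NonZeroCachedHash("").hashCode() + " vs " + "".hashCode());
    }
}
```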


But personally I would consider:
1. making hash a long
2. changing the way it is generated to ensure that:
  -- in most cases String.concat(...) would be able to determine the new
hash from the substring hashes, so it could always be set in the
constructor (with little effort this is possible now).
  -- it would contain a flag (bit) that tells us whether the hash is a bijection

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;

        if (hash != anotherString.hash) return false;
        // parentheses are needed: != binds tighter than &
        if ((hash & isHashBijection) != 0) return true;

        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}

As you know, this would require a lot of work and is probably not
worth the effect.


Notice one more thing: if we were able to know whether a String is
interned, equals could look like:

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        // identity was already checked on the first line
        if (isIntern() && anotherString.isIntern()) return false;

        int n = count;
        if (n == anotherString.count) {
        char v1[] = value;
        char v2[] = anotherString.value;
        int i = offset;
        int j = anotherString.offset;
        while (n-- != 0) {
            if (v1[i++] != v2[j++])
            return false;
        }
        return true;
        }
    }
    return false;
    }

This solution would empower .intern(), so once someone optimises an
application and uses interned strings it would improve speed and memory usage.
It also has no negative impact like calculating the hash in the
constructor. The problem is: where should this information be stored?
(I have some idea about it, but I doubt it would be accepted.)
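
As a small illustration of the identity argument (a sketch only: `InternIdentityDemo` and `equalsIfBothInterned` are hypothetical names, and `String` has no `isIntern()` method), two interned strings with the same contents are the same object, so identity alone settles equality once both sides are known to be interned:

```java
final class InternIdentityDemo {
    // Assumes both arguments are already interned; identity then decides equality.
    static boolean equalsIfBothInterned(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String x = new String("core-libs").intern();
        String y = new String("core-libs").intern();
        String z = new String("hotspot").intern();
        System.out.println(equalsIfBothInterned(x, y)); // true
        System.out.println(equalsIfBothInterned(x, z)); // false
    }
}
```

The open question from the paragraph above remains: the sketch simply assumes the caller knows both strings are interned, which is exactly the information that would have to be stored somewhere.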


-- 
Pozdrowionka. / Regards.
Lasu aka Marek Kozieł

http://lasu2string.blogspot.com/


From Ulf.Zibis at gmx.de  Thu Mar  4 20:04:24 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 04 Mar 2010 21:04:24 +0100
Subject: Tune String's hashCode() + equals() [was: Need reviewer for
	forward
	port of 6815768 (File.getXXXSpace) and 6815768 (String.hashCode)]
In-Reply-To: <28bca0ff1003041033j2547fe95l80614ddf38799331@mail.gmail.com>
References: <4B86B8A4.8050709@sun.com>
	<4B86BF2E.8030208@sun.com>	<4B86CAC5.4050405@sun.com>
	<4B86DAE1.5050208@gmx.de>	<4B86DBDC.9090703@sun.com>
	<4B86F490.60704@sun.com>	<4B8A952B.1010007@gmx.de>
	<28bca0ff1003041033j2547fe95l80614ddf38799331@mail.gmail.com>
Message-ID: <4B901248.8020408@gmx.de>

On 04.03.2010 19:33, Marek Kozieł wrote:
> Hello,
> I would suggest:
> public int hashCode() {
>          int h = hash;
>         if (h == 0) {
>             h = 0;
>             char[] val = value;
>             for (int i = offset, limit = count + i; i != limit; )
>                 h = 31 * h + val[i++];
>             if (h == 0)
>                 h++;
>             hash = h;
>         }
>         return h;
>     }
>
>    

Interesting alternative, but I'm afraid this is against the spec.
Mapping every hash of 0 to 1 would break String's hash definition:
h = 31 * h + val[i++].
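
To make the objection concrete, here is a minimal sketch (the class and method names are illustrative, not JDK code). It compares the documented polynomial hash with the proposed variant for the empty string, whose specified hash is 0:

```java
final class ZeroHashDemo {
    // Hash as documented for String: s[0]*31^(n-1) + ... + s[n-1].
    static int specHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++)
            h = 31 * h + s.charAt(i);
        return h;
    }

    // The proposed variant: a computed hash of 0 is stored and returned as 1.
    static int shiftedHash(String s) {
        int h = specHash(s);
        return (h == 0) ? 1 : h;
    }

    public static void main(String[] args) {
        String empty = "";
        System.out.println(specHash(empty) == empty.hashCode()); // true
        System.out.println(shiftedHash(empty));                  // 1, where the spec requires 0
    }
}
```

Any code that relies on the documented value, for example a hash precomputed outside the running VM, would see a mismatch for such strings.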


> But personally I would consider:
> 1. make hash long
> 2. change method of it's  generation to ensure that:
>    -- in most cases String.concat(...) would be able to determine new
> hash from substring hashes so it would be available to set it in
> constructor always (with little effort it's possible now).
>    -- would contains flag (bit) that would tell us if hash is bijection
>
>   public boolean equals(Object anObject) {
>      if (this == anObject) {
>          return true;
>      }
>      if (anObject instanceof String) {
>          String anotherString = (String)anObject;
>
>                  if (hash!=anotherString.hash) return false;
>    

only valid if (hash != 0 && anotherString.hash != 0)


>                  if ((hash & isHashBijection) != 0) return true;
>    

Which integer value should isHashBijection have ?

>          int n = count;
>          if (n == anotherString.count) {
>          char v1[] = value;
>          char v2[] = anotherString.value;
>          int i = offset;
>          int j = anotherString.offset;
>          while (n-- != 0) {
>              if (v1[i++] != v2[j++])
>              return false;
>          }
>          return true;
>          }
>      }
>      return false;
>      }
>
> As you know this would require a lot of work and probably it's not
> worth it's effect.
>
>
> Notice one more thing if we would be able to knew if String is in
> intern version, equal could look like:
>
> public boolean equals(Object anObject) {
>      if (this == anObject) {
>          return true;
>      }
>      if (anObject instanceof String) {
>          String anotherString = (String)anObject;
>          // identity was already checked on the first line
>          if (isIntern() && anotherString.isIntern()) return false;
>
>          int n = count;
>          if (n == anotherString.count) {
>          char v1[] = value;
>          char v2[] = anotherString.value;
>          int i = offset;
>          int j = anotherString.offset;
>          while (n-- != 0) {
>              if (v1[i++] != v2[j++])
>              return false;
>          }
>          return true;
>          }
>      }
>      return false;
>      }
>    

Interned strings already have their hashes computed to organize them in 
the VM's internal hash table. Unfortunately those hashes are not back-propagated to 
the Java object, so equals() can't benefit from them for now.

-Ulf


From develop4lasu at gmail.com  Thu Mar  4 20:31:20 2010
From: develop4lasu at gmail.com (Marek Kozieł)
Date: Thu, 4 Mar 2010 21:31:20 +0100
Subject: Tune String's hashCode() + equals() [was: Need reviewer for 
	forward port of 6815768 (File.getXXXSpace) and 6815768
	(String.hashCode)]
In-Reply-To: <4B901248.8020408@gmx.de>
References: <4B86B8A4.8050709@sun.com> <4B86BF2E.8030208@sun.com>
	<4B86CAC5.4050405@sun.com> <4B86DAE1.5050208@gmx.de>
	<4B86DBDC.9090703@sun.com> <4B86F490.60704@sun.com>
	<4B8A952B.1010007@gmx.de>
	<28bca0ff1003041033j2547fe95l80614ddf38799331@mail.gmail.com>
	<4B901248.8020408@gmx.de>
Message-ID: <28bca0ff1003041231n691bd44difcced567be841cd8@mail.gmail.com>

@Ulf
A few explanations:
1.
> Interesting alternative, but I'm afraid, this is against the spec.
> Shifting all 0's to 1 would break String's hash definition:  h = 31 * h + val[i++].
Yes, it does; anyway, I think the spec is too tight here. Do we really need
to hash every character even if a String has a length like 600000?
Nothing good comes from it in my opinion.
Has anyone seen even one piece of code relying on that?
Btw. the same goes for .compare

2.
private static final long isHashBijection= 0x8000000000000000L;
should be fine

3.
The second sample would work only if the hash were set in the constructor, so
that even 0 would be a valid hash.

4.
Maybe I'm wrong, but if most cases where the hash has to be computed are String
concatenations, then we could compute it roughly as
> a.hash+b.hash*powerOf31[a.length()]
so it would not consume so much time.

5.
Interned strings do not need hash codes for comparison because equal ones share
the same address, so the first check would return true if they are equal. After
this we only need to establish that they are not equal:
> if (isIntern() && anotherString.isIntern()) return false;

6.
Keep in mind that adding additional fields to String might not be an
option, because the memory lost this way may have a great impact on average
application efficiency.
-- 
Pozdrowionka. / Regards.
Lasu aka Marek Kozieł

http://lasu2string.blogspot.com/


From Ulf.Zibis at gmx.de  Thu Mar  4 21:10:18 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 04 Mar 2010 22:10:18 +0100
Subject: Need reviewer for forward port of 6815768 (File.getXXXSpace)
	and	6815768 (String.hashCode)
In-Reply-To: <SNT114-W49550CBCE9E2816A5DFCB783390@phx.gbl>
References: <4B86B8A4.8050709@sun.com>	<4B86BF2E.8030208@sun.com>,
	<4B86CAC5.4050405@sun.com>	<4B86DAE1.5050208@gmx.de>,
	<4B86DBDC.9090703@sun.com>	<4B86F490.60704@sun.com>,
	<4B8A952B.1010007@gmx.de> <SNT114-W49550CBCE9E2816A5DFCB783390@phx.gbl>
Message-ID: <4B9021BA.3060800@gmx.de>

On 04.03.2010 03:08, Jason Mehrens wrote:
> String.hash should only have two known states, zero and the actual 
> computed hash code.
>
> http://bugs.sun.com/view_bug.do?bug_id=6611830

In theory, yes.
But have you read the evaluation?
"This bug pattern is endemic in the JDK sources."

-Ulf


>
> Jason
>
>
> > Date: Sun, 28 Feb 2010 17:09:15 +0100
> > From: Ulf.Zibis at gmx.de
> > To: Alan.Bateman at Sun.COM
> > Subject: Re: Need reviewer for forward port of 6815768 
> (File.getXXXSpace) and 6815768 (String.hashCode)
> > CC: core-libs-dev at openjdk.java.net; dmitry.nadezhin at gmail.com; 
> Kelly.Ohair at Sun.COM
> >
> > Another thought:
> >
> > In the constructors of String we could initialize hash =
> > Integer.MIN_VALUE except if length == 0.
> > Then we could stay at the fastest version:
> >
> > public int hashCode() {
> > int h = hash;
> > if (h == Integer.MIN_VALUE) {
> > h = 0;
> > char[] val = value;
> > for (int i = offset, limit = count + i; i != limit; )
> > h = 31 * h + val[i++];
> > hash = h;
> > }
> > return h;
> > }
>

From Ulf.Zibis at gmx.de  Thu Mar  4 21:36:05 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 04 Mar 2010 22:36:05 +0100
Subject: Tune String's hashCode() + equals()
In-Reply-To: <28bca0ff1003041231n691bd44difcced567be841cd8@mail.gmail.com>
References: <4B86B8A4.8050709@sun.com> <4B86BF2E.8030208@sun.com>	
	<4B86CAC5.4050405@sun.com> <4B86DAE1.5050208@gmx.de>	
	<4B86DBDC.9090703@sun.com> <4B86F490.60704@sun.com>	
	<4B8A952B.1010007@gmx.de>	
	<28bca0ff1003041033j2547fe95l80614ddf38799331@mail.gmail.com>	
	<4B901248.8020408@gmx.de>
	<28bca0ff1003041231n691bd44difcced567be841cd8@mail.gmail.com>
Message-ID: <4B9027C5.6030304@gmx.de>

Many thanks for your effort.

On 04.03.2010 21:31, Marek Kozieł wrote:
> @Ulf
> Few explanations:
> 1.
>    
>> Intersting alternative, but I'm afraid, this is against the spec.
>> Shifting all 0's to 1 would break String's hash definition:  h = 31 * h + val[i++].
>>      
> Yes it does, any way i think spec is to tight here. Do we really need
> hash of each value even if String have length like 600000?
> there is noting good coming from it in my opinion.
> Did any one saw at least one code relaying on that ?
> Btw. Same come with .compare
>    

See discussion on project Coin list, subject "Benefit from computing 
String Hash at compile time?"

> 2.
> private static final long isHashBijection= 0x8000000000000000L;
> should be fine
>    

Now I understand how it should work.
Your algorithm can guarantee only 2^63 bijective values. So strings 
of length >= 4 can't have bijective hashes, as 4 chars have 2^64 
variations.
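
The counting argument can be illustrated with a sketch. It assumes a hypothetical 64-bit hash whose top bit plays the role of the `isHashBijection` flag; the class and method names are illustrative only. Strings of up to 3 chars need at most 48 bits plus a small length tag, so they can be packed without loss; anything longer falls back to an ordinary hash with the flag cleared:

```java
final class BijectiveHashSketch {
    static final long IS_HASH_BIJECTION = 0x8000000000000000L;

    static long hash64(String s) {
        int n = s.length();
        if (n <= 3) {
            long packed = n;                    // length tag keeps "" and "\0" apart
            for (int i = 0; i < n; i++)
                packed = (packed << 16) | s.charAt(i);
            return packed | IS_HASH_BIJECTION;  // flag set: equal hash implies equal string
        }
        long h = 0;
        for (int i = 0; i < n; i++)
            h = 31 * h + s.charAt(i);
        return h & ~IS_HASH_BIJECTION;          // flag cleared: collisions are possible
    }

    static boolean fastEquals(String a, String b) {
        long ha = hash64(a), hb = hash64(b);
        if (ha != hb) return false;
        if ((ha & IS_HASH_BIJECTION) != 0) return true;
        return a.equals(b);                     // long strings still need the char loop
    }
}
```

Note the parentheses in `(ha & IS_HASH_BIJECTION) != 0`: in Java `!=` binds tighter than `&`, so the unparenthesized form does not compile for integer operands.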

> 3.
> Second sample would work only if hash would be set in constructor so
> even 0 would be valid hash.
>
> 4.
> Maybe I'm wrong but if most cases when hash should be count is String
> concatenation then we could make it +/-
>    
>> a.hash+b.hash*powerOf31[a.length()]
>>      
> so it would not consume so much time.
>    

Interesting idea. I suggest to file an RFE.
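
For reference, a sketch of how this works out with the hash as currently specified (the helper names are illustrative). The quoted formula weights b's hash by a power depending on a's length, which would fit a hash defined with ascending powers; under the current definition, where the first char carries the highest power, it is a's hash that gets multiplied, by 31 raised to b's length:

```java
final class ConcatHashSketch {
    // 31^n with int overflow, matching the arithmetic of String.hashCode().
    static int pow31(int n) {
        int p = 1;
        for (int i = 0; i < n; i++)
            p *= 31;
        return p;
    }

    // hash(a + b) = hash(a) * 31^length(b) + hash(b), all modulo 2^32.
    static int concatHash(int hashA, int hashB, int lengthB) {
        return hashA * pow31(lengthB) + hashB;
    }

    public static void main(String[] args) {
        String a = "Hello, ", b = "world";
        int combined = concatHash(a.hashCode(), b.hashCode(), b.length());
        System.out.println(combined == (a + b).hashCode()); // true
    }
}
```

A table of precomputed powers, as in the quoted suggestion, would replace the loop in `pow31` for short strings.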

> 5.
> Intern string do not need hash codes co comparing cos they have same
> address, so first loop would return true if they are equal, after this
> we need only to check if they are not equal:
>    
>> if (isIntern()&&  anotherString.isIntern()) return false;
>>      

You are right, but
     if (h1 != 0 && h2 != 0 && h1 != h2) return false;
would perform the same (if the already computed internal hash were 
back-propagated to the Java object).
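
A minimal sketch of that shortcut, using a stand-in class rather than `java.lang.String` (`CachedHashText` is a hypothetical name). The early exit applies only when both hashes are already cached, since 0 means "not computed yet":

```java
final class CachedHashText {
    private final char[] value;
    private int hash; // 0 means "not computed yet" (or a genuine hash of 0)

    CachedHashText(String s) {
        value = s.toCharArray();
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            for (char c : value)
                h = 31 * h + c;
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedHashText)) return false;
        CachedHashText other = (CachedHashText) o;
        int h1 = hash, h2 = other.hash;
        // Both hashes cached and different: the contents cannot be equal.
        if (h1 != 0 && h2 != 0 && h1 != h2) return false;
        return java.util.Arrays.equals(value, other.value);
    }
}
```

If either hash is still 0, the comparison falls through to the full array comparison, so the shortcut never changes the result, only how quickly a mismatch is found.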

> 6.
> Have in mind that adding additional fields to string might be not an
> option, because memory lost this way may have great impact on average
> application efficiency.
>    

+ it would break compatibility if objects are serialized.

-Ulf


From develop4lasu at gmail.com  Thu Mar  4 22:38:52 2010
From: develop4lasu at gmail.com (Marek Kozieł)
Date: Thu, 4 Mar 2010 23:38:52 +0100
Subject: Tune String's hashCode() + equals()
In-Reply-To: <4B9027C5.6030304@gmx.de>
References: <4B86B8A4.8050709@sun.com> <4B86CAC5.4050405@sun.com>
	<4B86DAE1.5050208@gmx.de> <4B86DBDC.9090703@sun.com>
	<4B86F490.60704@sun.com> <4B8A952B.1010007@gmx.de>
	<28bca0ff1003041033j2547fe95l80614ddf38799331@mail.gmail.com>
	<4B901248.8020408@gmx.de>
	<28bca0ff1003041231n691bd44difcced567be841cd8@mail.gmail.com>
	<4B9027C5.6030304@gmx.de>
Message-ID: <28bca0ff1003041438m75c87d0fk30b3e1e097e1efb4@mail.gmail.com>

2010/3/4 Ulf Zibis <Ulf.Zibis at gmx.de>:

>> 5.
>> Intern string do not need hash codes co comparing cos they have same
>> address, so first loop would return true if they are equal, after this
>> we need only to check if they are not equal:
>>
>>>
>>> if (isIntern() && anotherString.isIntern()) return false;
>>>
>
> You are right, but
>    if (h1 != 0 && h2 != 0 && h1 != h2) return false;
> would perform same (if already computed internal hash would be
> back-propagated to the Java object).
>

Could you explain what you mean?


If you are looking for optimizations, I suggest (if it's not already partially
implemented) adding:
public static String String.intern(String str, int waste)
which would work like String.intern(String str), except that if the
intern table already contains an 'other' String such that
other.startsWith(str) && other.length() < str.length() + waste, then the new
interned String would use the same char array as the other String.

If I'm not wrong, this should free around 40% of the memory consumed by Strings.

This could also help a lot:
String String.intern(String str, String asPartOfString)

I prefer this solution because 10 MB of memory may mean more data
stored in memory instead of in a database, which can bring 10000% more
benefit than toying with an if.

-- 
Pozdrowionka. / Regards.
Lasu aka Marek Kozieł

http://lasu2string.blogspot.com/


From Ulf.Zibis at gmx.de  Thu Mar  4 23:28:44 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 05 Mar 2010 00:28:44 +0100
Subject: Tune String's hashCode() + equals()
In-Reply-To: <28bca0ff1003041438m75c87d0fk30b3e1e097e1efb4@mail.gmail.com>
References: <4B86B8A4.8050709@sun.com> <4B86CAC5.4050405@sun.com>	
	<4B86DAE1.5050208@gmx.de> <4B86DBDC.9090703@sun.com>	
	<4B86F490.60704@sun.com> <4B8A952B.1010007@gmx.de>	
	<28bca0ff1003041033j2547fe95l80614ddf38799331@mail.gmail.com>	
	<4B901248.8020408@gmx.de>	
	<28bca0ff1003041231n691bd44difcced567be841cd8@mail.gmail.com>	
	<4B9027C5.6030304@gmx.de>
	<28bca0ff1003041438m75c87d0fk30b3e1e097e1efb4@mail.gmail.com>
Message-ID: <4B90422C.6070705@gmx.de>

On 04.03.2010 23:38, Marek Kozieł wrote:
> 2010/3/4 Ulf Zibis<Ulf.Zibis at gmx.de>:
>
>    
>>> 5.
>>> Intern string do not need hash codes co comparing cos they have same
>>> address, so first loop would return true if they are equal, after this
>>> we need only to check if they are not equal:
>>>
>>>        
>>>> if (isIntern()&&    anotherString.isIntern()) return false;
>>>>
>>>>          
>> You are right, but
>>     if (h1 != 0&&  h2 != 0&&  h1 != h2) return false;
>> would perform same (if already computed internal hash would be
>> back-propagated to the Java object).
>>
>>      
> Could you explain what do you mean ?
>    

h1 = this.hash;
h2 = otherString.hash;

See:
In hotspot/src/share/vm/prims/jvm.cpp :
JVM_ENTRY(jstring, JVM_InternString(JNIEnv *env, jstring str))
   JVMWrapper("JVM_InternString");
   JvmtiVMObjectAllocEventCollector oam;
   if (str == NULL) return NULL;
   oop string = JNIHandles::resolve_non_null(str);
   oop result = StringTable::intern(string, CHECK_NULL);
   return (jstring) JNIHandles::make_local(env, result);
JVM_END

In hotspot/src/share/vm/classfile/symbolTable.cpp :

oop StringTable::intern(Handle string_or_null, jchar* name,
                         int len, TRAPS) {
   unsigned int hashValue = hash_string(name, len);
   int index = the_table()->hash_to_index(hashValue);
   oop string = the_table()->lookup(index, name, len, hashValue);

   // Found
   if (string != NULL) return string;

   // Otherwise, add to symbol to table
   return the_table()->basic_add(index, string_or_null, name, len,
                                 hashValue, CHECK_NULL);
}

int StringTable::hash_string(jchar* s, int len) {
   unsigned h = 0;
   for (jchar* end = s + len; s < end; s++)
     h = 31*h + (unsigned) *s;
   return h;
}


From lana.steuck at sun.com  Fri Mar  5 06:46:13 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 05 Mar 2010 06:46:13 +0000
Subject: hg: jdk7/tl: Added tag jdk7-b84 for changeset 2f3ea057d1ad
Message-ID: <20100305064613.4D93E43E6F@hg.openjdk.java.net>

Changeset: cf26288a114b
Author:    mikejwre
Date:      2010-02-18 13:31 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/rev/cf26288a114b

Added tag jdk7-b84 for changeset 2f3ea057d1ad

! .hgtags


From lana.steuck at sun.com  Fri Mar  5 06:46:19 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 05 Mar 2010 06:46:19 +0000
Subject: hg: jdk7/tl/corba: Added tag jdk7-b84 for changeset 68c8961a82e4
Message-ID: <20100305064620.B3DA043E70@hg.openjdk.java.net>

Changeset: c67a9df7bc0c
Author:    mikejwre
Date:      2010-02-18 13:31 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/corba/rev/c67a9df7bc0c

Added tag jdk7-b84 for changeset 68c8961a82e4

! .hgtags


From lana.steuck at sun.com  Fri Mar  5 06:48:40 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 05 Mar 2010 06:48:40 +0000
Subject: hg: jdk7/tl/hotspot: 27 new changesets
Message-ID: <20100305065004.1989143E71@hg.openjdk.java.net>

Changeset: 125eb6a9fccf
Author:    mikejwre
Date:      2010-02-18 13:31 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/125eb6a9fccf

Added tag jdk7-b84 for changeset ffc8d176b84b

! .hgtags

Changeset: 745c853ee57f
Author:    johnc
Date:      2010-01-29 14:51 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/745c853ee57f

6885297: java -XX:RefDiscoveryPolicy=2 or -XX:TLABWasteTargetPercent=0 cause VM crash
Summary: Interval checking is now being performed on the values passed in for these two flags. The current acceptable range for RefDiscoveryPolicy is [0..1], and for TLABWasteTargetPercent it is [1..100].
Reviewed-by: apetrusenko, ysr

! src/share/vm/includeDB_core
! src/share/vm/memory/referenceProcessor.hpp
! src/share/vm/runtime/arguments.cpp
! src/share/vm/runtime/arguments.hpp

Changeset: 6484c4ee11cb
Author:    ysr
Date:      2010-02-01 17:29 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/6484c4ee11cb

6904516: More object array barrier fixes, following up on 6906727
Summary: Fixed missing pre-barrier calls for G1, modified C1 to call pre- and correct post-barrier interfaces, deleted obsolete interface, (temporarily) disabled redundant deferred barrier in BacktraceBuilder.
Reviewed-by: coleenp, jmasa, kvn, never

! src/share/vm/c1/c1_Runtime1.cpp
! src/share/vm/classfile/javaClasses.cpp
! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp
! src/share/vm/memory/barrierSet.hpp
! src/share/vm/memory/barrierSet.inline.hpp
! src/share/vm/runtime/stubRoutines.cpp

Changeset: deada8912c54
Author:    johnc
Date:      2010-02-02 18:39 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/deada8912c54

6914402: G1: assert(!is_young_card(cached_ptr),"shouldn't get a card in young region")
Summary: Invalid assert. Filter cards evicted from the card count cache instead.
Reviewed-by: apetrusenko, tonyp

! src/share/vm/gc_implementation/g1/concurrentG1Refine.cpp

Changeset: 230fac611b50
Author:    johnc
Date:      2010-02-08 09:58 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/230fac611b50

Merge

! src/share/vm/c1/c1_Runtime1.cpp
! src/share/vm/includeDB_core

Changeset: 455df1b81409
Author:    kamg
Date:      2010-02-08 13:49 -0500
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/455df1b81409

6587322: dtrace probe object__alloc doesn't fire in some situations on amd64
Summary: Fix misplaced probe point
Reviewed-by: rasbold, phh
Contributed-by: neojia at gmail.com

! src/cpu/x86/vm/templateTable_x86_64.cpp

Changeset: 95d21201c29a
Author:    apangin
Date:      2010-02-11 10:48 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/95d21201c29a

Merge


Changeset: 3f5b7efb9642
Author:    never
Date:      2010-02-05 11:07 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/3f5b7efb9642

6920293: OptimizeStringConcat causing core dumps
Reviewed-by: kvn, twisti

! src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp
! src/share/vm/code/nmethod.cpp
! src/share/vm/opto/stringopts.cpp
! src/share/vm/runtime/sharedRuntime.cpp

Changeset: 576e77447e3c
Author:    kvn
Date:      2010-02-07 12:15 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/576e77447e3c

6923002: assert(false,"this call site should not be polymorphic")
Summary: Clear the total count when a receiver information is cleared.
Reviewed-by: never, jrose

! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp
! src/cpu/sparc/vm/interp_masm_sparc.cpp
! src/cpu/sparc/vm/sharedRuntime_sparc.cpp
! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp
! src/cpu/x86/vm/interp_masm_x86_32.cpp
! src/cpu/x86/vm/interp_masm_x86_64.cpp
! src/share/vm/ci/ciMethod.cpp
! src/share/vm/oops/methodDataOop.hpp
! src/share/vm/opto/doCall.cpp
! src/share/vm/opto/runtime.cpp
! src/share/vm/runtime/arguments.cpp

Changeset: f516d5d7a019
Author:    kvn
Date:      2010-02-08 12:20 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/f516d5d7a019

6910605: C2: NullPointerException/ClassCaseException is thrown when C2 with DeoptimizeALot is used
Summary: Set the reexecute bit for runtime calls _new_array_Java when they used for _multianewarray bytecode.
Reviewed-by: never

! src/share/vm/code/pcDesc.cpp
! src/share/vm/opto/graphKit.cpp
! src/share/vm/opto/parse3.cpp
+ test/compiler/6910605/Test.java

Changeset: f70b0d9ab095
Author:    kvn
Date:      2010-02-09 01:31 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/f70b0d9ab095

6910618: C2: Error: assert(d->is_oop(),"JVM_ArrayCopy: dst not an oop")
Summary: Mark in PcDesc call sites which return oop and save the result oop across objects reallocation during deoptimization.
Reviewed-by: never

! src/share/vm/c1/c1_IR.hpp
! src/share/vm/code/debugInfoRec.cpp
! src/share/vm/code/debugInfoRec.hpp
! src/share/vm/code/nmethod.cpp
! src/share/vm/code/pcDesc.hpp
! src/share/vm/code/scopeDesc.cpp
! src/share/vm/code/scopeDesc.hpp
! src/share/vm/includeDB_core
! src/share/vm/opto/output.cpp
! src/share/vm/prims/jvmtiCodeBlobEvents.cpp
! src/share/vm/runtime/deoptimization.cpp
+ test/compiler/6910618/Test.java

Changeset: 4ee1c645110e
Author:    kvn
Date:      2010-02-09 10:21 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/4ee1c645110e

6924097: assert((_type == Type::MEMORY) == (_adr_type != 0),"adr_type for memory phis only")
Summary: Use PhiNode::make_blank(r, n) method to construct the phi.
Reviewed-by: never

! src/share/vm/opto/loopopts.cpp

Changeset: e3a4305c6bc3
Author:    kvn
Date:      2010-02-12 08:54 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/e3a4305c6bc3

6925249: assert(last_sp < (intptr_t*) interpreter_frame_monitor_begin(),"bad tos")
Summary: Fix assert since top deoptimized frame has last_sp == interpreter_frame_monitor_begin if there are no expressions.
Reviewed-by: twisti

! src/cpu/x86/vm/frame_x86.inline.hpp
! src/share/vm/runtime/deoptimization.cpp
! src/share/vm/runtime/frame.cpp
! src/share/vm/runtime/vframeArray.cpp

Changeset: c09ee209b65c
Author:    kvn
Date:      2010-02-12 10:34 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/c09ee209b65c

6926048: Improve Zero performance
Summary: Make Zero figure out result types in a similar way to C++ interpreter implementation.
Reviewed-by: kvn
Contributed-by: gbenson at redhat.com

! src/cpu/zero/vm/cppInterpreter_zero.cpp
! src/cpu/zero/vm/cppInterpreter_zero.hpp

Changeset: 7b4415a18c8a
Author:    kvn
Date:      2010-02-12 15:27 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/7b4415a18c8a

Merge

! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp
! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp
! src/share/vm/includeDB_core
! src/share/vm/opto/graphKit.cpp
! src/share/vm/opto/runtime.cpp
! src/share/vm/runtime/arguments.cpp
! src/share/vm/runtime/sharedRuntime.cpp

Changeset: 38836cf1d8d2
Author:    tonyp
Date:      2010-02-05 11:05 -0500
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/38836cf1d8d2

6920977: G1: guarantee(k == probe->klass(),"klass should be in dictionary") fails
Summary: the guarantee is too strict and the test will fail (incorrectly) if the class is not in the system dictionary but in the placeholders.
Reviewed-by: acorn, phh

! src/share/vm/classfile/loaderConstraints.cpp
! src/share/vm/classfile/loaderConstraints.hpp
! src/share/vm/classfile/systemDictionary.cpp
! src/share/vm/includeDB_core

Changeset: 9eee977dd1a9
Author:    tonyp
Date:      2010-02-08 14:23 -0500
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/9eee977dd1a9

6802453: G1: hr()->is_in_reserved(from),"Precondition."
Summary: The operations of re-using a RSet component and expanding the same RSet component were not mutually exlusive, and this could lead to RSets getting corrupted and entries being dropped.
Reviewed-by: iveresov, johnc

! src/share/vm/gc_implementation/g1/heapRegionRemSet.cpp

Changeset: 8859772195c6
Author:    johnc
Date:      2010-02-09 13:56 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/8859772195c6

6782663: Data produced by PrintGCApplicationConcurrentTime and PrintGCApplicationStoppedTime is not accurate.
Summary: Update and display the timers associated with these flags for all safepoints.
Reviewed-by: ysr, jcoomes

! src/share/vm/runtime/vmThread.cpp
! src/share/vm/services/runtimeService.cpp

Changeset: 0414c1049f15
Author:    iveresov
Date:      2010-02-11 15:52 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/0414c1049f15

6923991: G1: improve scalability of RSet scanning
Summary: Implemented block-based work stealing. Moved copying during the rset scanning phase to the main copying phase. Made the size of rset table depend on the region size.
Reviewed-by: apetrusenko, tonyp

! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp
! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp
! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp
! src/share/vm/gc_implementation/g1/g1OopClosures.hpp
! src/share/vm/gc_implementation/g1/g1OopClosures.inline.hpp
! src/share/vm/gc_implementation/g1/g1RemSet.cpp
! src/share/vm/gc_implementation/g1/g1_globals.hpp
! src/share/vm/gc_implementation/g1/g1_specialized_oop_closures.hpp
! src/share/vm/gc_implementation/g1/heapRegionRemSet.cpp
! src/share/vm/gc_implementation/g1/heapRegionRemSet.hpp
! src/share/vm/gc_implementation/g1/sparsePRT.cpp
! src/share/vm/gc_implementation/g1/sparsePRT.hpp
! src/share/vm/memory/cardTableModRefBS.hpp
! src/share/vm/utilities/globalDefinitions.hpp

Changeset: 58add740c4ee
Author:    johnc
Date:      2010-02-16 14:11 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/58add740c4ee

Merge

! src/share/vm/includeDB_core

Changeset: e7b1cc79bd25
Author:    kvn
Date:      2010-02-16 16:17 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/e7b1cc79bd25

6926697: "optimized" VM build failed: The type "AdapterHandlerTableIterator" is incomplete
Summary: Define AdapterHandlerTableIterator class as non product instead of debug.
Reviewed-by: never

! src/share/vm/runtime/sharedRuntime.cpp

Changeset: 106f41e88c85
Author:    never
Date:      2010-02-16 20:07 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/106f41e88c85

6877221: Endless deoptimizations in OSR nmethod
Reviewed-by: kvn

! src/share/vm/opto/parse1.cpp

Changeset: b4b440360f1e
Author:    twisti
Date:      2010-02-18 11:35 +0100
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/b4b440360f1e

6926782: CodeBuffer size too small after 6921352
Summary: After 6921352 the CodeBuffer size was too small.
Reviewed-by: kvn, never

! src/share/vm/opto/callGenerator.cpp
! src/share/vm/opto/compile.cpp
! src/share/vm/opto/compile.hpp
! src/share/vm/opto/output.cpp

Changeset: 3b687c53c266
Author:    twisti
Date:      2010-02-18 06:54 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/3b687c53c266

6927165: Zero S/390 fixes
Summary: Fixes two failures on 31-bit S/390.
Reviewed-by: twisti
Contributed-by: Gary Benson <gbenson at redhat.com>

! src/cpu/zero/vm/globals_zero.hpp
! src/os_cpu/linux_zero/vm/os_linux_zero.hpp

Changeset: 72f1840531a4
Author:    twisti
Date:      2010-02-18 10:44 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/72f1840531a4

Merge


Changeset: 1f341bb67b5b
Author:    trims
Date:      2010-02-18 22:15 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/1f341bb67b5b

Merge


Changeset: 6c9796468b91
Author:    trims
Date:      2010-02-18 22:16 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/6c9796468b91

6927886: Bump the HS17 build number to 10
Summary: Update the HS17 build number to 10
Reviewed-by: jcoomes

! make/hotspot_version


From lana.steuck at sun.com  Fri Mar  5 06:53:59 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 05 Mar 2010 06:53:59 +0000
Subject: hg: jdk7/tl/jaxp: Added tag jdk7-b84 for changeset 32c0cf01d555
Message-ID: <20100305065359.9313F43E73@hg.openjdk.java.net>

Changeset: 6c0ccabb430d
Author:    mikejwre
Date:      2010-02-18 13:31 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jaxp/rev/6c0ccabb430d

Added tag jdk7-b84 for changeset 32c0cf01d555

! .hgtags


From lana.steuck at sun.com  Fri Mar  5 06:54:06 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 05 Mar 2010 06:54:06 +0000
Subject: hg: jdk7/tl/jaxws: Added tag jdk7-b84 for changeset 8bc02839eee4
Message-ID: <20100305065406.7BDCD43E74@hg.openjdk.java.net>

Changeset: 8424512588ff
Author:    mikejwre
Date:      2010-02-18 13:31 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jaxws/rev/8424512588ff

Added tag jdk7-b84 for changeset 8bc02839eee4

! .hgtags


From lana.steuck at sun.com  Fri Mar  5 06:54:39 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 05 Mar 2010 06:54:39 +0000
Subject: hg: jdk7/tl/jdk: 5 new changesets
Message-ID: <20100305065612.5781943E77@hg.openjdk.java.net>

Changeset: a9b4fde406d4
Author:    mikejwre
Date:      2010-02-18 13:31 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/a9b4fde406d4

Added tag jdk7-b84 for changeset 7cb9388bb1a1

! .hgtags

Changeset: 2ba381560071
Author:    dcherepanov
Date:      2010-02-12 19:58 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/2ba381560071

6705345: Enable multiple file selection in AWT FileDialog
Reviewed-by: art, anthony, alexp

! src/share/classes/java/awt/FileDialog.java
! src/share/classes/sun/awt/AWTAccessor.java
! src/solaris/classes/sun/awt/X11/XFileDialogPeer.java
! src/windows/classes/sun/awt/windows/WFileDialogPeer.java
! src/windows/native/sun/windows/awt_FileDialog.cpp
! src/windows/native/sun/windows/awt_FileDialog.h
+ test/java/awt/FileDialog/MultipleMode/MultipleMode.html
+ test/java/awt/FileDialog/MultipleMode/MultipleMode.java

Changeset: d6d2de6ee2d1
Author:    lana
Date:      2010-02-19 15:13 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/d6d2de6ee2d1

Merge


Changeset: b396584a3e64
Author:    lana
Date:      2010-02-23 10:17 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/b396584a3e64

Merge

- make/java/text/FILES_java.gmk

Changeset: c2d29e5695c2
Author:    lana
Date:      2010-03-04 13:40 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/c2d29e5695c2

Merge


From lana.steuck at sun.com  Fri Mar  5 07:03:30 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 05 Mar 2010 07:03:30 +0000
Subject: hg: jdk7/tl/langtools: 3 new changesets
Message-ID: <20100305070338.B02E543E79@hg.openjdk.java.net>

Changeset: 75d5bd12eb86
Author:    mikejwre
Date:      2010-02-18 13:31 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/75d5bd12eb86

Added tag jdk7-b84 for changeset d9cd5b8286e4

! .hgtags

Changeset: 136bfc679462
Author:    lana
Date:      2010-02-23 10:17 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/136bfc679462

Merge


Changeset: c55733ceed61
Author:    lana
Date:      2010-03-04 13:40 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/c55733ceed61

Merge


From martinrb at google.com  Fri Mar  5 09:04:32 2010
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 5 Mar 2010 01:04:32 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
Message-ID: <1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>

Hi Kevin,

As you've noticed, creating objects within a factor of two of
their natural limits is a good way to expose lurking bugs.

I'm the one responsible for the algorithm in ArrayList.
I'm a bit embarrassed, looking at that code today.
We could set the array size to Integer.MAX_VALUE,
but then you might hit an independent buglet in hotspot
that you cannot allocate an array with Integer.MAX_VALUE
elements, but Integer.MAX_VALUE - 5 (or so) works.

It occurs to me that increasing the size by 50% is better done by
int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;

I agree with the plan of setting the capacity to something near
MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
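
A minimal sketch of that plan, for illustration only. The class name, the
method name, and the margin of 8 below Integer.MAX_VALUE are assumptions;
the thread itself only says that Integer.MAX_VALUE - 5 (or so) works.

```java
// Illustrative sketch only; not the JDK implementation.
final class ArrayGrowthSketch {
    // Assumed cap slightly below Integer.MAX_VALUE (the exact margin is a guess).
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /** Returns a capacity >= minCapacity, growing by about 50% where possible. */
    static int newCapacity(int oldCapacity, int minCapacity) {
        if (minCapacity < 0) {
            // The caller's "size + 1" wrapped around: nothing larger can exist.
            throw new OutOfMemoryError("required capacity overflows int");
        }
        // Addition plus shift: the sum stays below 2^32, so an overflow
        // always shows up as a negative value.
        int grown = oldCapacity + (oldCapacity >> 1) + 1;
        if (grown < 0 || grown > MAX_ARRAY_SIZE) {
            grown = MAX_ARRAY_SIZE; // clamp near the limit instead of shrinking
        }
        return Math.max(grown, minCapacity);
    }
}
```

With oldCapacity = 1_000_000_000 the sketch returns 1_500_000_001; with
oldCapacity = 1_500_000_000 the sum wraps and the result is clamped to
MAX_ARRAY_SIZE.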

These bugs are not known.
Chris Hegarty, could you file a bug for us?

Martin

On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> Greetings,
>
> I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
> java.io.ByteArrayOutputStream which arise when the capacities of the data
> structures reach a particular threshold.  More below.
>
> When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its size
> reaches its capacity and an add or an insert operation is invoked, the
> capacity is increased by only one element.  Notice that in the following
> excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) *
> oldCapacity + 1 unless this value would not suffice to accommodate the
> required capacity in which case it is set to the required capacity.  If the
> current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity *
> 3)/2 + 1 overflows and resolves to a negative number resulting in the new
> capacity being set to the required capacity.  The major consequence of this
> is that each subsequent add/insert operation results in a full resize of the
> ArrayList causing performance to degrade significantly.
>
>         int newCapacity = (oldCapacity * 3)/2 + 1;
>         if (newCapacity < minCapacity)
>             newCapacity = minCapacity;
>
> Hashtable breaks entirely when the size of its backing array reaches (1/2) *
> Integer.MAX_VALUE and a rehash is necessary as is evident from the following
> excerpt from rehash.  Notice that rehash will attempt to create an array of
> negative size if the size of the backing array reaches (1/2) *
> Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a
> negative number.
>
>     int newCapacity = oldCapacity * 2 + 1;
>     HashtableEntry newTable[] = new HashtableEntry[newCapacity];
>
> When the capacity of the backing array in a ByteArrayOutputStream reaches
> (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
> operation is invoked, the capacity of the backing array is increased only by
> the required number of elements.  Notice that in the following excerpt from
> ByteArrayOutputStream.write(int) the new backing array capacity is set to 2
> * buf.length unless this value would not suffice to accommodate the required
> capacity in which case it is set to the required capacity.  If the current
> backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
> buf.length << 1 overflows and resolves to a negative number resulting in the
> new capacity being set to the required capacity.  The major consequence of
> this, like with ArrayList, is that each subsequent write operation results
> in a full resize of the ByteArrayOutputStream causing performance to degrade
> significantly.
>
>     int newcount = count + 1;
>     if (newcount > buf.length) {
>         buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
>     }
>
> It is interesting to note that any statements about the amortized time
> complexity of add/insert operations, such as the one in the ArrayList
> javadoc, are invalidated by the performance related bugs.  One solution to
> the above situations is to set the new capacity of the backing array to
> Integer.MAX_VALUE when the initial size calculation results in a negative
> number during a resize.
>
> Apologies if these bugs are already known.
>
> Regards,
>
> Kevin
>
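
The three size computations quoted above can be evaluated in isolation.
This is a small sketch with hypothetical class and method names; the
expressions are taken from the excerpts and use plain int arithmetic. One
detail worth noting: the product oldCapacity * 3 already wraps once
oldCapacity exceeds Integer.MAX_VALUE / 3.

```java
// Hypothetical demo class; only the arithmetic comes from the quoted excerpts.
final class CapacityOverflowDemo {
    /** Growth expression from the ArrayList.ensureCapacity excerpt. */
    static int arrayListGrowth(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    /** Growth expression from the Hashtable.rehash excerpt. */
    static int hashtableGrowth(int oldCapacity) {
        return oldCapacity * 2 + 1;
    }

    /** Size passed to Arrays.copyOf in the ByteArrayOutputStream excerpt. */
    static int byteArrayGrowth(int bufLength, int newcount) {
        return Math.max(bufLength << 1, newcount);
    }
}
```

For example, arrayListGrowth(1_000_000_000) and hashtableGrowth(1 << 30) are
both negative, and byteArrayGrowth(1 << 30, (1 << 30) + 1) falls back to
newcount, that is, growth by a single element.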


From develop4lasu at gmail.com  Fri Mar  5 09:47:39 2010
From: develop4lasu at gmail.com (develop4lasu at gmail.com)
Date: Fri, 05 Mar 2010 09:47:39 +0000
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
Message-ID: <0015174412fa29216404810a9b22@google.com>

Hello,

I use my own collection classes wherever possible, so I can add some thoughts:

1. I would decrease the default array size to 4, 6, or 8. For me that freed a
few MB of memory (I suggest testing on an application that uses at least 300 MB).

I would test:

initial size: 4
long newCapacity = ((long)oldCapacity) + (oldCapacity >> 1) + 2;

initial size: 6
long newCapacity = ((long)oldCapacity) + (oldCapacity >> 1) + 2;

initial size: 8
long newCapacity = ((long)oldCapacity) + (oldCapacity >> 1) + 2;

initial size: 4
long newCapacity = ((long)oldCapacity) + (oldCapacity >> 2) + 4;

and then use:
(int) Math.min(newCapacity, Integer.MAX_VALUE);
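
As a sketch, the first variant above combined with the clamp could look as
follows. The class and method names are made up for illustration; widening
to long is what keeps the intermediate sum from wrapping.

```java
// Illustrative sketch; names are hypothetical.
final class LongGrowthSketch {
    /** Grows by about 50% plus 2, computed in long and clamped to int range. */
    static int grow(int oldCapacity) {
        long newCapacity = ((long) oldCapacity) + (oldCapacity >> 1) + 2;
        return (int) Math.min(newCapacity, Integer.MAX_VALUE);
    }
}
```

Clamping to Integer.MAX_VALUE itself may still run into the VM allocation
limit mentioned earlier in the thread, so a slightly smaller cap might be
needed in practice.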


It would also be nice if Collections.addAll(...) asked for the proper
capacity before adding any elements.
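
Independently of what Collections.addAll does internally, a caller that
holds an ArrayList can reserve the room itself. A sketch with a hypothetical
helper name, built on the existing ArrayList.ensureCapacity and
Collections.addAll methods:

```java
import java.util.ArrayList;
import java.util.Collections;

// Illustrative helper; the class and method names are hypothetical.
final class PresizeSketch {
    /** Reserves capacity once, then appends all elements. */
    static <T> void addAllPresized(ArrayList<T> target, T[] elements) {
        // Sum in long so that size + length cannot wrap around.
        long wanted = (long) target.size() + elements.length;
        target.ensureCapacity((int) Math.min(wanted, Integer.MAX_VALUE));
        Collections.addAll(target, elements);
    }
}
```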

Greetings.

On 05-03-2010 10:04, Martin Buchholz <martinrb at google.com> wrote:
> Hi Kevin,
>
> As you've noticed, creating objects within a factor of two of
> their natural limits is a good way to expose lurking bugs.
>
> I'm the one responsible for the algorithm in ArrayList.
> I'm a bit embarrassed, looking at that code today.
> We could set the array size to Integer.MAX_VALUE,
> but then you might hit an independent buglet in hotspot
> that you cannot allocate an array with Integer.MAX_VALUE
> elements, but Integer.MAX_VALUE - 5 (or so) works.
>
> It occurs to me that increasing the size by 50% is better done by
> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>
> I agree with the plan of setting the capacity to something near
> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
>
> These bugs are not known.
> Chris Hegarty, could you file a bug for us?
>
> Martin
>
> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> > Greetings,
> >
> > I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
> > java.io.ByteArrayOutputStream which arise when the capacities of the data
> > structures reach a particular threshold.  More below.
> >
> > When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its size
> > reaches its capacity and an add or an insert operation is invoked, the
> > capacity is increased by only one element.  Notice that in the following
> > excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) *
> > oldCapacity + 1 unless this value would not suffice to accommodate the
> > required capacity in which case it is set to the required capacity.  If the
> > current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity *
> > 3)/2 + 1 overflows and resolves to a negative number resulting in the new
> > capacity being set to the required capacity.  The major consequence of this
> > is that each subsequent add/insert operation results in a full resize of the
> > ArrayList causing performance to degrade significantly.
> >
> >         int newCapacity = (oldCapacity * 3)/2 + 1;
> >         if (newCapacity < minCapacity)
> >             newCapacity = minCapacity;
> >
> > Hashtable breaks entirely when the size of its backing array reaches (1/2) *
> > Integer.MAX_VALUE and a rehash is necessary as is evident from the following
> > excerpt from rehash.  Notice that rehash will attempt to create an array of
> > negative size if the size of the backing array reaches (1/2) *
> > Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a
> > negative number.
> >
> >     int newCapacity = oldCapacity * 2 + 1;
> >     HashtableEntry newTable[] = new HashtableEntry[newCapacity];
> >
> > When the capacity of the backing array in a ByteArrayOutputStream reaches
> > (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
> > operation is invoked, the capacity of the backing array is increased only by
> > the required number of elements.  Notice that in the following excerpt from
> > ByteArrayOutputStream.write(int) the new backing array capacity is set to 2
> > * buf.length unless this value would not suffice to accommodate the required
> > capacity in which case it is set to the required capacity.  If the current
> > backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
> > buf.length << 1 overflows and resolves to a negative number resulting in the
> > new capacity being set to the required capacity.  The major consequence of
> > this, like with ArrayList, is that each subsequent write operation results
> > in a full resize of the ByteArrayOutputStream causing performance to degrade
> > significantly.
> >
> >     int newcount = count + 1;
> >     if (newcount > buf.length) {
> >         buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
> >     }
> >
> > It is interesting to note that any statements about the amortized time
> > complexity of add/insert operations, such as the one in the ArrayList
> > javadoc, are invalidated by the performance related bugs.  One solution to
> > the above situations is to set the new capacity of the backing array to
> > Integer.MAX_VALUE when the initial size calculation results in a negative
> > number during a resize.
> >
> > Apologies if these bugs are already known.
> >
> > Regards,
> >
> > Kevin
> >


From Ulf.Zibis at gmx.de  Fri Mar  5 10:06:10 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 05 Mar 2010 11:06:10 +0100
Subject: Bugs in java.util.ArrayList,
	java.util.Hashtable and 	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
Message-ID: <4B90D792.8000207@gmx.de>

On 05.03.2010 10:04, Martin Buchholz wrote:
> Hi Kevin,
>
> As you've noticed, creating objects within a factor of two of
> their natural limits is a good way to expose lurking bugs.
>
> I'm the one responsible for the algorithm in ArrayList.
> I'm a bit embarrassed, looking at that code today.
> We could set the array size to Integer.MAX_VALUE,
> but then you might hit an independent buglet in hotspot
> that you cannot allocate an array with Integer.MAX_VALUE
> elements, but Integer.MAX_VALUE - 5 (or so) works.
>    

I think using a maximum size of Integer.MAX_VALUE - x looks awful, in
particular if it's badly commented in the sources.
I suggest introducing something like
System.MAX_COLLECTION_SIZE/CAPACITY or .maxCollectionSize/Capacity().

-Ulf


From kevin.l.stern at gmail.com  Fri Mar  5 10:39:10 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Fri, 5 Mar 2010 04:39:10 -0600
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <4B90D792.8000207@gmx.de>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<4B90D792.8000207@gmx.de>
Message-ID: <1704b7a21003050239i33b66297kf8d03e25836ee462@mail.gmail.com>

FYI, HashMap independently defines a MAXIMUM_CAPACITY variable; it might be
a good idea to retrofit this and other such local definitions with any
system wide variables that are defined.

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;
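
For context, a cap of 1 << 30 follows from the power-of-two requirement: it
is the largest power of two that fits in a positive int. The sketch below,
with a hypothetical class and method name, shows one way to round a
requested capacity up under that cap. It is an illustration, not a copy of
the HashMap source.

```java
// Illustrative sketch; not the HashMap implementation.
final class PowerOfTwoCapacitySketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** Smallest power of two >= requested, capped at MAXIMUM_CAPACITY. */
    static int tableSizeFor(int requested) {
        if (requested < 0) {
            throw new IllegalArgumentException("negative capacity: " + requested);
        }
        if (requested > MAXIMUM_CAPACITY) {
            requested = MAXIMUM_CAPACITY; // clamp before doubling
        }
        int capacity = 1;
        while (capacity < requested) {
            capacity <<= 1; // never exceeds 1 << 30 because of the clamp
        }
        return capacity;
    }
}
```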

Regards,

Kevin

On Fri, Mar 5, 2010 at 4:06 AM, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:

> On 05.03.2010 10:04, Martin Buchholz wrote:
>
>> Hi Kevin,
>>
>> As you've noticed, creating objects within a factor of two of
>> their natural limits is a good way to expose lurking bugs.
>>
>> I'm the one responsible for the algorithm in ArrayList.
>> I'm a bit embarrassed, looking at that code today.
>> We could set the array size to Integer.MAX_VALUE,
>> but then you might hit an independent buglet in hotspot
>> that you cannot allocate an array with Integer.MAX_VALUE
>> elements, but Integer.MAX_VALUE - 5 (or so) works.
>>
>>
>
> I think using a maximum size of Integer.MAX_VALUE - x looks awful, in
> particular if it's badly commented in the sources.
> I suggest introducing something like System.MAX_COLLECTION_SIZE/CAPACITY
> or .maxCollectionSize/Capacity().
>
> -Ulf
>
>
>

From kevin.l.stern at gmail.com  Fri Mar  5 10:48:59 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Fri, 5 Mar 2010 04:48:59 -0600
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
Message-ID: <1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>

Hi Martin,

Thank you for your reply.  If I may, PriorityQueue appears to employ the
simple strategy that I suggested above in its grow method:

        int newCapacity = ((oldCapacity < 64)?
                           ((oldCapacity + 1) * 2):
                           ((oldCapacity / 2) * 3));
        if (newCapacity < 0) // overflow
            newCapacity = Integer.MAX_VALUE;

It might be desirable to set a common strategy for capacity increase for all
collections.
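
Wrapped in a method (the class and method names here are hypothetical), the
excerpt can be exercised near the limit. Because it divides before
multiplying, the product stays below 2^32, so an overflow always shows up as
a negative value and the sign test catches it.

```java
// Illustrative wrapper around the quoted PriorityQueue growth policy.
final class PriorityQueueGrowthSketch {
    static int grow(int oldCapacity) {
        // Double small queues, grow large ones by about 50%.
        int newCapacity = (oldCapacity < 64)
                ? (oldCapacity + 1) * 2
                : (oldCapacity / 2) * 3;
        if (newCapacity < 0) { // overflow
            newCapacity = Integer.MAX_VALUE;
        }
        return newCapacity;
    }
}
```

By contrast, (oldCapacity * 3)/2 multiplies first, so for very large
capacities the product can wrap all the way round to a small positive
number, which a sign test alone would not detect.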

Regards,

Kevin

On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com> wrote:

> Hi Kevin,
>
> As you've noticed, creating objects within a factor of two of
> their natural limits is a good way to expose lurking bugs.
>
> I'm the one responsible for the algorithm in ArrayList.
> I'm a bit embarrassed, looking at that code today.
> We could set the array size to Integer.MAX_VALUE,
> but then you might hit an independent buglet in hotspot
> that you cannot allocate an array with Integer.MAX_VALUE
> elements, but Integer.MAX_VALUE - 5 (or so) works.
>
> It occurs to me that increasing the size by 50% is better done by
> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>
> I agree with the plan of setting the capacity to something near
> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
>
> These bugs are not known.
> Chris Hegarty, could you file a bug for us?
>
> Martin
>
> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
> > Greetings,
> >
> > I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
> > java.io.ByteArrayOutputStream which arise when the capacities of the data
> > structures reach a particular threshold.  More below.
> >
> > When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its size
> > reaches its capacity and an add or an insert operation is invoked, the
> > capacity is increased by only one element.  Notice that in the following
> > excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) *
> > oldCapacity + 1 unless this value would not suffice to accommodate the
> > required capacity in which case it is set to the required capacity.  If the
> > current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity *
> > 3)/2 + 1 overflows and resolves to a negative number resulting in the new
> > capacity being set to the required capacity.  The major consequence of this
> > is that each subsequent add/insert operation results in a full resize of the
> > ArrayList causing performance to degrade significantly.
> >
> >         int newCapacity = (oldCapacity * 3)/2 + 1;
> >         if (newCapacity < minCapacity)
> >             newCapacity = minCapacity;
> >
> > Hashtable breaks entirely when the size of its backing array reaches (1/2) *
> > Integer.MAX_VALUE and a rehash is necessary as is evident from the following
> > excerpt from rehash.  Notice that rehash will attempt to create an array of
> > negative size if the size of the backing array reaches (1/2) *
> > Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a
> > negative number.
> >
> >     int newCapacity = oldCapacity * 2 + 1;
> >     HashtableEntry newTable[] = new HashtableEntry[newCapacity];
> >
> > When the capacity of the backing array in a ByteArrayOutputStream reaches
> > (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
> > operation is invoked, the capacity of the backing array is increased only by
> > the required number of elements.  Notice that in the following excerpt from
> > ByteArrayOutputStream.write(int) the new backing array capacity is set to 2
> > * buf.length unless this value would not suffice to accommodate the required
> > capacity in which case it is set to the required capacity.  If the current
> > backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
> > buf.length << 1 overflows and resolves to a negative number resulting in the
> > new capacity being set to the required capacity.  The major consequence of
> > this, like with ArrayList, is that each subsequent write operation results
> > in a full resize of the ByteArrayOutputStream causing performance to degrade
> > significantly.
> >
> >     int newcount = count + 1;
> >     if (newcount > buf.length) {
> >         buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
> >     }
> >
> > It is interesting to note that any statements about the amortized time
> > complexity of add/insert operations, such as the one in the ArrayList
> > javadoc, are invalidated by the performance related bugs.  One solution to
> > the above situations is to set the new capacity of the backing array to
> > Integer.MAX_VALUE when the initial size calculation results in a negative
> > number during a resize.
> >
> > Apologies if these bugs are already known.
> >
> > Regards,
> >
> > Kevin
> >
>

From jonathan.gibbons at sun.com  Sat Mar  6 00:14:47 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Sat, 06 Mar 2010 00:14:47 +0000
Subject: hg: jdk7/tl/langtools: 2 new changesets
Message-ID: <20100306001501.DFE7143F6E@hg.openjdk.java.net>

Changeset: a23282f17d0b
Author:    jjg
Date:      2010-03-05 16:12 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/a23282f17d0b

6930108: IllegalArgumentException in AbstractDiagnosticFormatter for tools/javac/api/TestJavacTaskScanner.jav
Reviewed-by: darcy

! src/share/classes/com/sun/tools/javac/util/BasicDiagnosticFormatter.java
! test/tools/javac/api/TestJavacTaskScanner.java
+ test/tools/javac/api/TestResolveError.java

Changeset: a4f3b97c8028
Author:    jjg
Date:      2010-03-05 16:13 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/a4f3b97c8028

Merge


From Kelly.Ohair at Sun.COM  Sat Mar  6 17:18:17 2010
From: Kelly.Ohair at Sun.COM (Kelly O'Hair)
Date: Sat, 06 Mar 2010 09:18:17 -0800
Subject: Test failure java/nio/channels/Selector/OpRead.java
Message-ID: <4B928E59.1070005@sun.com>


Just to record the event...

TEST: java/nio/channels/Selector/OpRead.java

Failed on Fedora 9 32bit machine prt-x2200-1.sfbay, NOT using -samevm.

I'll file a bug  if it repeats, or you ask for one to be filed.

-kto

--------------------------------------------------
TEST: java/nio/channels/Selector/OpRead.java
JDK under test: (/tmp/jprt/P1/T/060322.ohair/testproduct/linux_i586_2.6-product)
java version "1.7.0-2010-03-06-060322.ohair.jdk"
Java(TM) SE Runtime Environment (build 1.7.0-2010-03-06-060322.ohair.jdk-jprtadm_2010_03_05_22_07-b00)
Java HotSpot(TM) Server VM (build 17.0-b10, mixed mode)

ACTION: build -- Passed. Build successful
REASON: Named class compiled on demand
TIME:   0.702 seconds
messages:
command: build OpRead
reason: Named class compiled on demand
elapsed time (seconds): 0.702

ACTION: compile -- Passed. Compilation successful
REASON: .class file out of date or does not exist
TIME:   0.702 seconds
messages:
command: compile /tmp/jprt/P1/T/060322.ohair/source/test/java/nio/channels/Selector/OpRead.java
reason: .class file out of date or does not exist
elapsed time (seconds): 0.702
STDOUT:
STDERR:

ACTION: main -- Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Test failed
REASON: Assumed action based on file name: run main OpRead
TIME:   1.146 seconds
messages:
command: main OpRead
reason: Assumed action based on file name: run main OpRead
elapsed time (seconds): 1.146
STDOUT:
STDERR:
java.lang.RuntimeException: Test failed
	at OpRead.test(OpRead.java:68)
	at OpRead.main(OpRead.java:83)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:613)
	at com.sun.javatest.regtest.MainWrapper$MainThread.run(MainWrapper.java:94)
	at java.lang.Thread.run(Thread.java:717)

JavaTest Message: Test threw exception: java.lang.RuntimeException: Test failed
JavaTest Message: shutting down test

STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Test failed

TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Test failed
--------------------------------------------------


From Alan.Bateman at Sun.COM  Sat Mar  6 17:32:52 2010
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Sat, 06 Mar 2010 17:32:52 +0000
Subject: Test failure java/nio/channels/Selector/OpRead.java
In-Reply-To: <4B928E59.1070005@sun.com>
References: <4B928E59.1070005@sun.com>
Message-ID: <4B9291C4.40205@sun.com>

Kelly O'Hair wrote:
>
> Just to record the event...
>
> TEST: java/nio/channels/Selector/OpRead.java
>
> Failed on Fedora 9 32bit machine prt-x2200-1.sfbay, NOT using -samevm.
>
> I'll file a bug  if it repeats, or you ask for one to be filed.
>
> -kto
Looking at it now, the test has a timing issue and I'm surprised we 
haven't seen this failure before. So yes, please create a bug.

-Alan.


From Ulf.Zibis at gmx.de  Sat Mar  6 21:00:19 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sat, 06 Mar 2010 22:00:19 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4B8EB46C.1010208@sun.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com>
Message-ID: <4B92C263.9020404@gmx.de>

That was very fast, Sherman. Many thanks.

Could you set the bug to accepted and evaluated, so my patch will have a 
chance to get into the code base?

-Ulf


On 03.03.2010 20:11, Xueming Shen wrote:
> #6931812
>
> Martin Buchholz wrote:
>> Sherman, would you like to file bugs for Ulf's improvements?
>>
>> On Wed, Mar 3, 2010 at 02:44, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>> On 03.03.2010 09:00, Martin Buchholz wrote:
>>
>>>> Keep in mind that supplementary characters are extremely rare.
>>>>
>>> Yes, but many APIs in the JDK are used rarely.
>>> Why should they waste memory footprint or perform badly, particularly if
>>> the improvement doesn't cost anything?
>>
>> I admire your perfectionism.
>>
>>>> Therefore the existing implementation
>>>>
>>>>  return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>>>      && codePoint <= MAX_CODE_POINT;
>>>>
>>>> will almost always perform just one comparison against a constant,
>>>> which is hard to beat.
>>>>
>>> 1. Wondering: I think there are TWO comparisons.
>>> 2. Those comparisons need to load 32-bit values from machine code, against
>>> only 8-bit values in my case.
>>
>> It's a good point.  In the machine code, shifts are likely to use
>> immediate values, and so will be a small win.
>>
>> int x = codePoint >>> 16;
>> return x != 0 && x < 0x11;
>>
>> (On modern hardware, these optimizations
>> are less valuable than they used to be;
>> ordinary integer arithmetic is almost free)
>>
>> Martin
>
>


From kelly.ohair at sun.com  Sat Mar  6 23:00:11 2010
From: kelly.ohair at sun.com (kelly.ohair at sun.com)
Date: Sat, 06 Mar 2010 23:00:11 +0000
Subject: hg: jdk7/tl/jdk: 6915983: testing problems, adjusting list of tests, 
	needs some investigation
Message-ID: <20100306230106.5B8CA440AE@hg.openjdk.java.net>

Changeset: 58b44ac0b10d
Author:    ohair
Date:      2010-03-06 14:59 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/58b44ac0b10d

6915983: testing problems, adjusting list of tests, needs some investigation
Reviewed-by: alanb

! test/Makefile
! test/ProblemList.txt


From kelly.ohair at sun.com  Sat Mar  6 23:01:34 2010
From: kelly.ohair at sun.com (kelly.ohair at sun.com)
Date: Sat, 06 Mar 2010 23:01:34 +0000
Subject: hg: jdk7/tl: 2 new changesets
Message-ID: <20100306230134.7DB32440AF@hg.openjdk.java.net>

Changeset: 4d7419e4b759
Author:    ohair
Date:      2010-03-06 15:00 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/rev/4d7419e4b759

6928700: Configure top repo for JPRT testing
Reviewed-by: alanb, jjg

! make/jprt.properties
+ test/Makefile

Changeset: f3664d6879ab
Author:    ohair
Date:      2010-03-06 15:01 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/rev/f3664d6879ab

Merge


From martinrb at google.com  Tue Mar  9 02:10:37 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 8 Mar 2010 18:10:37 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
Message-ID: <1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>

[Chris or Alan, please review and file a bug]

OK, guys,

Here's a patch:

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/

Martin

On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> Hi Martin,
>
> Thank you for your reply.  If I may, PriorityQueue appears to employ the
> simple strategy that I suggested above in its grow method:
>
>         int newCapacity = ((oldCapacity < 64)?
>                            ((oldCapacity + 1) * 2):
>                            ((oldCapacity / 2) * 3));
>         if (newCapacity < 0) // overflow
>             newCapacity = Integer.MAX_VALUE;
>
> It might be desirable to set a common strategy for capacity increase for all
> collections.
>
> Regards,
>
> Kevin
>
> On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com> wrote:
>>
>> Hi Kevin,
>>
>> As you've noticed, creating objects within a factor of two of
>> their natural limits is a good way to expose lurking bugs.
>>
>> I'm the one responsible for the algorithm in ArrayList.
>> I'm a bit embarrassed, looking at that code today.
>> We could set the array size to Integer.MAX_VALUE,
>> but then you might hit an independent buglet in hotspot
>> that you cannot allocate an array with Integer.MAX_VALUE
>> elements, but Integer.MAX_VALUE - 5 (or so) works.
>>
>> It occurs to me that increasing the size by 50% is better done by
>> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>>
>> I agree with the plan of setting the capacity to something near
>> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
>>
>> These bugs are not known.
>> Chris Hegarty, could you file a bug for us?
>>
>> Martin
>>
>> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com>
>> wrote:
>> > Greetings,
>> >
>> > I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
>> > java.io.ByteArrayOutputStream which arise when the capacities of the
>> > data
>> > structures reach a particular threshold.  More below.
>> >
>> > When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its
>> > size
>> > reaches its capacity and an add or an insert operation is invoked, the
>> > capacity is increased by only one element.  Notice that in the following
>> > excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) *
>> > oldCapacity + 1 unless this value would not suffice to accommodate the
>> > required capacity in which case it is set to the required capacity.  If
>> > the
>> > current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity
>> > *
>> > 3)/2 + 1 overflows and resolves to a negative number resulting in the
>> > new
>> > capacity being set to the required capacity.  The major consequence of
>> > this
>> > is that each subsequent add/insert operation results in a full resize of
>> > the
>> > ArrayList causing performance to degrade significantly.
>> >
>> >         int newCapacity = (oldCapacity * 3)/2 + 1;
>> >         if (newCapacity < minCapacity)
>> >             newCapacity = minCapacity;
>> >
>> > Hashtable breaks entirely when the size of its backing array reaches
>> > (1/2) *
>> > Integer.MAX_VALUE and a rehash is necessary as is evident from the
>> > following
>> > excerpt from rehash.  Notice that rehash will attempt to create an array
>> > of
>> > negative size if the size of the backing array reaches (1/2) *
>> > Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a
>> > negative number.
>> >
>> >     int newCapacity = oldCapacity * 2 + 1;
>> >     HashtableEntry newTable[] = new HashtableEntry[newCapacity];
>> >
>> > When the capacity of the backing array in a ByteArrayOutputStream
>> > reaches
>> > (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
>> > operation is invoked, the capacity of the backing array is increased
>> > only by
>> > the required number of elements.  Notice that in the following excerpt
>> > from
>> > ByteArrayOutputStream.write(int) the new backing array capacity is set
>> > to 2
>> > * buf.length unless this value would not suffice to accommodate the
>> > required
>> > capacity in which case it is set to the required capacity.  If the
>> > current
>> > backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
>> > buf.length << 1 overflows and resolves to a negative number resulting in
>> > the
>> > new capacity being set to the required capacity.  The major consequence
>> > of
>> > this, like with ArrayList, is that each subsequent write operation
>> > results
>> > in a full resize of the ByteArrayOutputStream causing performance to
>> > degrade
>> > significantly.
>> >
>> >     int newcount = count + 1;
>> >     if (newcount > buf.length) {
>> >         buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
>> >     }
>> >
>> > It is interesting to note that any statements about the amortized time
>> > complexity of add/insert operations, such as the one in the ArrayList
>> > javadoc, are invalidated by the performance related bugs.  One solution
>> > to
>> > the above situations is to set the new capacity of the backing array to
>> > Integer.MAX_VALUE when the initial size calculation results in a
>> > negative
>> > number during a resize.
>> >
>> > Apologies if these bugs are already known.
>> >
>> > Regards,
>> >
>> > Kevin
>> >
>
>


From martinrb at google.com  Tue Mar  9 02:13:38 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 8 Mar 2010 18:13:38 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
Message-ID: <1ccfd1c11003081813o4ea436d2o2414182160e20d76@mail.gmail.com>

On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> Hi Martin,
>
> Thank you for your reply.  If I may, PriorityQueue appears to employ the
> simple strategy that I suggested above in its grow method:
>
>         int newCapacity = ((oldCapacity < 64)?
>                            ((oldCapacity + 1) * 2):
>                            ((oldCapacity / 2) * 3));
>         if (newCapacity < 0) // overflow
>             newCapacity = Integer.MAX_VALUE;
>
> It might be desirable to set a common strategy for capacity increase for all
> collections.

The PriorityQueue implementation is better than always doubling,
but not better enough to change the expansion policy of existing heavily used
collection classes.
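
For reference, the growth rule quoted above can be tried in isolation. This is only a minimal sketch: the class and method names are hypothetical, and it is not the JDK source. It shows that the `newCapacity < 0` check catches the wrapped value, because `(oldCapacity / 2) * 3` can never wrap all the way back to a non-negative int.

```java
// Sketch of the quoted growth rule; QuotedGrowPolicy and newCapacity are
// hypothetical names chosen for this illustration.
final class QuotedGrowPolicy {
    static int newCapacity(int oldCapacity) {
        int newCapacity = (oldCapacity < 64)
                ? (oldCapacity + 1) * 2      // small arrays: roughly double
                : (oldCapacity / 2) * 3;     // larger arrays: grow by about 50%
        if (newCapacity < 0) {               // int arithmetic wrapped around
            newCapacity = Integer.MAX_VALUE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(10));             // 22
        System.out.println(newCapacity(100));            // 150
        System.out.println(newCapacity(1_500_000_000));  // 2147483647
    }
}
```

Note that the clamp target here is Integer.MAX_VALUE itself, which, as mentioned earlier in the thread, a VM may refuse to allocate.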

Martin


From martinrb at google.com  Tue Mar  9 02:20:08 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 8 Mar 2010 18:20:08 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <0015174412fa29216404810a9b22@google.com>
References: <1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<0015174412fa29216404810a9b22@google.com>
Message-ID: <1ccfd1c11003081820o712ec40bk8f5ba446282cc5aa@mail.gmail.com>

2010/3/5  <develop4lasu at gmail.com>:
> Hello,
>
> I use my own collections where possible, so I can add some thoughts:
>
> 1. I would decrease the default array size to 4/6/8; for me that freed a few
> MB of memory (I suggest testing on an application that uses at least 300 MB).
>
> I would test:
>
> initial size: 4
> long newCapacity = ((long)oldCapacity) + (oldCapacity >> 1) + 2;
>
> initial size: 6
> long newCapacity = ((long)oldCapacity) + (oldCapacity >> 1) + 2;
>
> initial size: 8
> long newCapacity = ((long)oldCapacity) + (oldCapacity >> 1) + 2;
>
> initial size: 4
> long newCapacity = ((long)oldCapacity) + (oldCapacity >> 2) + 4;

I agree that smaller initial sizes would be better,
(and better yet would be to eventually shrink sizes of arrays!)
but it's very hard to change the default behavior of classes
in the JDK.  Java benchmarks typically do not test
memory-constrained environments, so the JDK
usually optimizes for time over space.

This is the kind of optimization that might better go
into the less conservative IcedTea fork.

> and then use:
>> (int)Math.min(newCapacity, Integer.MAX_VALUE);

The above expression always yields newCapacity.
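
Whether the clamp is a no-op depends on the declared type of newCapacity. The sketch below is an illustration only, with hypothetical class and method names. With an int intermediate, no value can exceed Integer.MAX_VALUE, so Math.min changes nothing. With the long declaration used in the quoted proposal, the sum does not wrap and the clamp can take effect.

```java
// Illustration only; ClampDemo and its methods are hypothetical names.
final class ClampDemo {
    // long intermediate, as in the quoted proposal: the sum cannot wrap,
    // so Math.min can actually clamp it before the narrowing cast.
    static int withLong(int oldCapacity) {
        long newCapacity = ((long) oldCapacity) + (oldCapacity >> 1) + 2;
        return (int) Math.min(newCapacity, Integer.MAX_VALUE);
    }

    // int intermediate: the sum wraps first, and Math.min against
    // Integer.MAX_VALUE always returns its first argument.
    static int withInt(int oldCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1) + 2;
        return Math.min(newCapacity, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(withLong(2_000_000_000)); // 2147483647
        System.out.println(withInt(2_000_000_000));  // -1294967294
    }
}
```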

Martin


From martinrb at google.com  Tue Mar  9 02:37:52 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 8 Mar 2010 18:37:52 -0800
Subject: Support for PARTIAL_FLUSH in Deflater
Message-ID: <1ccfd1c11003081837i41e7640bvf239ad351d65c887@mail.gmail.com>

Hi FlaterMouses,

We added support for various "flush modes" to Deflater,
but we did not include support for PARTIAL_FLUSH,
because not even zlib.h is enthusiastic about it:

#define Z_PARTIAL_FLUSH 1 /* will be removed, use Z_SYNC_FLUSH instead */

But it sure looks like Z_PARTIAL_FLUSH will never actually be removed.
It's been a few years, and PARTIAL_FLUSH is actively used by SSH
implementations, as vaguely specified in RFC 4253.

Costin argued for PARTIAL_FLUSH elsewhere:

"""
The jzlib library was written exactly for this reason - it was not possible to
implement SSH using Deflater, and with this change it still isn't possible.

The following compression methods are currently defined:

     none     REQUIRED        no compression
     zlib     OPTIONAL        ZLIB (LZ77) compression

  The "zlib" compression is described in [RFC1950] and in [RFC1951].
  The compression context is initialized after each key exchange, and
  is passed from one packet to the next, with only a partial flush
  being performed at the end of each packet.  A partial flush means
  that the current compressed block is ended and all data will be
  output.  If the current block is not a stored block, one or more
  empty blocks are added after the current block to ensure that there
  are at least 8 bits, counting from the start of the end-of-block code
  of the current block to the end of the packet payload.

So anyone implementing SSH with compression will have to use introspection or
an alternate compression library.
"""

Since adding the missing support seems easier than arguing about it,
I suggest we Just Do It.
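
To make the per-packet flushing concrete, here is a minimal sketch using a flush mode that Deflater does offer, SYNC_FLUSH, through the four-argument deflate method added in JDK 7. It does not show PARTIAL_FLUSH, which is the mode under discussion and is not assumed to exist here. The class and helper names are hypothetical. One compression context is shared across packets, and each packet is flushed so that the receiver can decode it as soon as it arrives.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Sketch only: SyncFlushSketch and its helpers are hypothetical names.
final class SyncFlushSketch {
    // Compress one packet and flush, keeping the deflater's context alive.
    static byte[] deflatePacket(Deflater deflater, String packet) {
        deflater.setInput(packet.getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        int n;
        do {
            // A full buffer means there may be more flushed output pending.
            n = deflater.deflate(buf, 0, buf.length, Deflater.SYNC_FLUSH);
            out.write(buf, 0, n);
        } while (n == buf.length);
        return out.toByteArray();
    }

    // Decode everything that one flushed packet makes available.
    static String inflatePacket(Inflater inflater, byte[] compressed) {
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            int n;
            while ((n = inflater.inflate(buf)) > 0) {
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Deflater deflater = new Deflater();
        Inflater inflater = new Inflater();
        for (String packet : new String[] {"first packet", "second packet"}) {
            byte[] wire = deflatePacket(deflater, packet);
            System.out.println(inflatePacket(inflater, wire));
        }
    }
}
```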

Martin


From dmytro_sheyko at hotmail.com  Tue Mar  9 10:19:08 2010
From: dmytro_sheyko at hotmail.com (Dmytro Sheyko)
Date: Tue, 9 Mar 2010 17:19:08 +0700
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and
	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>,
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>,
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>,
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
Message-ID: <SNT137-w17EDADB29A713F1A83FD6B8A340@phx.gbl>


Is there any reason to use comparisons like these

if (newCapacity - minCapacity < 0)

if (newCapacity - MAX_ARRAY_SIZE > 0) {

instead of

if (newCapacity < minCapacity)

if (newCapacity > MAX_ARRAY_SIZE) {

Thanks,
Dmytro

> Date: Mon, 8 Mar 2010 18:10:37 -0800
> Subject: Re: Bugs in java.util.ArrayList, java.util.Hashtable and 	java.io.ByteArrayOutputStream
> From: martinrb at google.com
> To: kevin.l.stern at gmail.com; christopher.hegarty at sun.com; alan.bateman at sun.com
> CC: core-libs-dev at openjdk.java.net
> 
> [Chris or Alan, please review and file a bug]
> 
> OK, guys,
> 
> Here's a patch:
> 
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
> 
> Martin

 		 	   		  

From kevin.l.stern at gmail.com  Tue Mar  9 10:38:17 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Tue, 9 Mar 2010 04:38:17 -0600
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <SNT137-w17EDADB29A713F1A83FD6B8A340@phx.gbl>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<SNT137-w17EDADB29A713F1A83FD6B8A340@phx.gbl>
Message-ID: <1704b7a21003090238j74fcf04fs54e4d1aedbfa8a20@mail.gmail.com>

These comparisons are essential to the working of Martin's algorithm.  I
found them interesting as well, but notice that when the capacity overflows
these comparisons will always be false.  That is to say:

oldCapacity < minCapacity (given, otherwise we would not be resizing)
therefore oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity -
minCapacity < oldCapacity

So if oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity >
Integer.MAX_VALUE, subtracting minCapacity re-overflows back into the
positive number realm.
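
The difference between the two forms of comparison can be seen with concrete numbers. The following is a small sketch with hypothetical names; the value of MAX_ARRAY_SIZE is an assumption made for the illustration, not a quote from the patch.

```java
// Sketch only; names and the MAX_ARRAY_SIZE value are assumptions.
final class OverflowComparison {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8; // assumed value

    // Subtraction form: compares the wrapped difference with zero.
    static boolean belowMinSubtract(int newCapacity, int minCapacity) {
        return newCapacity - minCapacity < 0;
    }

    // Direct form: misreads a wrapped (negative) newCapacity as "too small".
    static boolean belowMinDirect(int newCapacity, int minCapacity) {
        return newCapacity < minCapacity;
    }

    static boolean aboveMaxSubtract(int newCapacity) {
        return newCapacity - MAX_ARRAY_SIZE > 0;
    }

    static boolean aboveMaxDirect(int newCapacity) {
        return newCapacity > MAX_ARRAY_SIZE;
    }

    public static void main(String[] args) {
        int oldCapacity = 1_500_000_000;
        int minCapacity = oldCapacity + 1;
        int newCapacity = oldCapacity + (oldCapacity >> 1); // wraps to -2044967296

        System.out.println(belowMinSubtract(newCapacity, minCapacity)); // false
        System.out.println(belowMinDirect(newCapacity, minCapacity));   // true
        System.out.println(aboveMaxSubtract(newCapacity));              // true
        System.out.println(aboveMaxDirect(newCapacity));                // false
    }
}
```

With the subtraction form, the wrapped value is treated as "larger than the maximum" and can be routed to the huge-capacity path. With the direct form it would silently be replaced by minCapacity, which is the one-element-at-a-time growth described at the start of the thread.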

That being said, and this is a question/comment to all, I want to point out
that this type of code assumes a particular class of orderly overflow
behavior.  Is this specified in the Java spec, or will this break on an
obscure machine that does not use, say, two's complement arithmetic?

Regards,

Kevin

2010/3/9 Dmytro Sheyko <dmytro_sheyko at hotmail.com>

>  Is there any reason to use comparison like this
>
> if (newCapacity - minCapacity < 0)
>
> if (newCapacity - MAX_ARRAY_SIZE > 0) {
>
> instead of
>
> if (newCapacity < minCapacity)
>
> if (newCapacity > MAX_ARRAY_SIZE) {
>
> Thanks,
> Dmytro
>
> > Date: Mon, 8 Mar 2010 18:10:37 -0800
> > Subject: Re: Bugs in java.util.ArrayList, java.util.Hashtable and
> java.io.ByteArrayOutputStream
> > From: martinrb at google.com
> > To: kevin.l.stern at gmail.com; christopher.hegarty at sun.com;
> alan.bateman at sun.com
> > CC: core-libs-dev at openjdk.java.net
>
> >
> > [Chris or Alan, please review and file a bug]
> >
> > OK, guys,
> >
> > Here's a patch:
> >
> > http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
> >
> > Martin
>
>

From kevin.l.stern at gmail.com  Tue Mar  9 11:02:21 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Tue, 9 Mar 2010 05:02:21 -0600
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1704b7a21003090238j74fcf04fs54e4d1aedbfa8a20@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<SNT137-w17EDADB29A713F1A83FD6B8A340@phx.gbl>
	<1704b7a21003090238j74fcf04fs54e4d1aedbfa8a20@mail.gmail.com>
Message-ID: <1704b7a21003090302p5bd875f3wbc7bf131296cb8dd@mail.gmail.com>

I did a quick search and it appears that Java is indeed two's complement
based.  Nonetheless, please allow me to point out that, in general, this
type of code worries me since I fully expect that at some point someone will
come along and do exactly what Dmytro suggested; that is, someone will
change:

if (a - b > 0)

to

if (a > b)

and the entire ship will sink.  I, personally, like to avoid obscurities
such as making integer overflow an essential basis for my algorithm unless
there is a good reason to do so.  I would, in general, prefer to avoid
overflow altogether and to make the overflow scenario more explicit:

if (oldCapacity > RESIZE_OVERFLOW_THRESHOLD) {
   // do something
} else {
  // do something else
}

Of course, these are simply my coding preferences and I may very well be
missing the 'good reason' to take the current approach.
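
A minimal sketch of what this explicit style could look like for 50% growth follows. Every name in it, including RESIZE_OVERFLOW_THRESHOLD and the value chosen for MAX_ARRAY_SIZE, is an assumption made for illustration and is not taken from Martin's patch.

```java
// Sketch only; all names and constants here are assumptions.
final class ExplicitThresholdGrow {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8; // assumed value

    // Chosen so that oldCapacity + oldCapacity / 2 cannot exceed
    // MAX_ARRAY_SIZE for any oldCapacity up to this threshold.
    static final int RESIZE_OVERFLOW_THRESHOLD = (int) (MAX_ARRAY_SIZE * 2L / 3);

    static int grow(int oldCapacity, int minCapacity) {
        if (minCapacity < 0 || minCapacity > MAX_ARRAY_SIZE) {
            throw new OutOfMemoryError("Required array size too large");
        }
        int newCapacity;
        if (oldCapacity > RESIZE_OVERFLOW_THRESHOLD) {
            newCapacity = MAX_ARRAY_SIZE;                     // growing by 50% would overshoot
        } else {
            newCapacity = oldCapacity + (oldCapacity >> 1);   // cannot wrap here
        }
        return Math.max(newCapacity, minCapacity);
    }

    public static void main(String[] args) {
        System.out.println(grow(10, 11));                        // 15
        System.out.println(grow(1_500_000_000, 1_500_000_001));  // 2147483639
    }
}
```

No intermediate value ever wraps, so plain comparisons remain correct; the cost is one extra branch on every resize.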

Regards,

Kevin

On Tue, Mar 9, 2010 at 4:38 AM, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:

> These comparisons are essential to the working of Martin's algorithm.  I
> found them interesting as well, but notice that when the capacity overflows
> these comparisons will always be false.  That is to say:
>
> oldCapacity < minCapacity (given, otherwise we would not be resizing)
> therefore oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity -
> minCapacity < oldCapacity
>
> So if oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity >
> Integer.MAX_VALUE, subtracting minCapacity re-overflows back into the
> positive number realm.
>
> That being said, and this is a question/comment to all, I want to point out
> that this type of code assumes a particular class of orderly overflow
> behavior.  Is this specified in the Java spec, or will this break on an
> obscure machine that does not use, say, two's complement arithmetic?
>
> Regards,
>
> Kevin
>
> 2010/3/9 Dmytro Sheyko <dmytro_sheyko at hotmail.com>
>
>  Is there any reason to use comparison like this
>>
>> if (newCapacity - minCapacity < 0)
>>
>> if (newCapacity - MAX_ARRAY_SIZE > 0) {
>>
>> instead of
>>
>> if (newCapacity < minCapacity)
>>
>> if (newCapacity > MAX_ARRAY_SIZE) {
>>
>> Thanks,
>> Dmytro
>>
>> > Date: Mon, 8 Mar 2010 18:10:37 -0800
>> > Subject: Re: Bugs in java.util.ArrayList, java.util.Hashtable and
>> java.io.ByteArrayOutputStream
>> > From: martinrb at google.com
>> > To: kevin.l.stern at gmail.com; christopher.hegarty at sun.com;
>> alan.bateman at sun.com
>> > CC: core-libs-dev at openjdk.java.net
>>
>> >
>> > [Chris or Alan, please review and file a bug]
>> >
>> > OK, guys,
>> >
>> > Here's a patch:
>> >
>> > http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>> >
>> > Martin
>>
>>

From Christopher.Hegarty at Sun.COM  Tue Mar  9 11:41:30 2010
From: Christopher.Hegarty at Sun.COM (Christopher Hegarty -Sun Microsystems Ireland)
Date: Tue, 09 Mar 2010 11:41:30 +0000
Subject: Bugs in java.util.ArrayList,
	java.util.Hashtable and	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
Message-ID: <4B9633EA.8070101@sun.com>

Sorry Martin, I appear to have missed your original request to file this 
bug. I since filed the following:

  6933217: Huge arrays handled poorly in core libraries

The changes you are proposing seem reasonable to me.

-Chris.

Martin Buchholz wrote:
> [Chris or Alan, please review and file a bug]
> 
> OK, guys,
> 
> Here's a patch:
> 
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
> 
> Martin
> 
> On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>> Hi Martin,
>>
>> Thank you for your reply.  If I may, PriorityQueue appears to employ the
>> simple strategy that I suggested above in its grow method:
>>
>>         int newCapacity = ((oldCapacity < 64)?
>>                            ((oldCapacity + 1) * 2):
>>                            ((oldCapacity / 2) * 3));
>>         if (newCapacity < 0) // overflow
>>             newCapacity = Integer.MAX_VALUE;
>>
>> It might be desirable to set a common strategy for capacity increase for all
>> collections.
>>
>> Regards,
>>
>> Kevin
>>
>> On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com> wrote:
>>> Hi Kevin,
>>>
>>> As you've noticed, creating objects within a factor of two of
>>> their natural limits is a good way to expose lurking bugs.
>>>
>>> I'm the one responsible for the algorithm in ArrayList.
>>> I'm a bit embarrassed, looking at that code today.
>>> We could set the array size to Integer.MAX_VALUE,
>>> but then you might hit an independent buglet in hotspot
>>> that you cannot allocate an array with Integer.MAX_VALUE
>>> elements, but Integer.MAX_VALUE - 5 (or so) works.
>>>
>>> It occurs to me that increasing the size by 50% is better done by
>>> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>>>
>>> I agree with the plan of setting the capacity to something near
>>> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
>>>
>>> These bugs are not known.
>>> Chris Hegarty, could you file a bug for us?
>>>
>>> Martin
>>>
>>> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com>
>>> wrote:
>>>> Greetings,
>>>>
>>>> I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
>>>> java.io.ByteArrayOutputStream which arise when the capacities of the
>>>> data
>>>> structures reach a particular threshold.  More below.
>>>>
>>>> When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its
>>>> size
>>>> reaches its capacity and an add or an insert operation is invoked, the
>>>> capacity is increased by only one element.  Notice that in the following
>>>> excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) *
>>>> oldCapacity + 1 unless this value would not suffice to accommodate the
>>>> required capacity in which case it is set to the required capacity.  If
>>>> the
>>>> current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity
>>>> *
>>>> 3)/2 + 1 overflows and resolves to a negative number resulting in the
>>>> new
>>>> capacity being set to the required capacity.  The major consequence of
>>>> this
>>>> is that each subsequent add/insert operation results in a full resize of
>>>> the
>>>> ArrayList causing performance to degrade significantly.
>>>>
>>>>         int newCapacity = (oldCapacity * 3)/2 + 1;
>>>>             if (newCapacity < minCapacity)
>>>>         newCapacity = minCapacity;
>>>>
>>>> Hashtable breaks entirely when the size of its backing array reaches
>>>> (1/2) *
>>>> Integer.MAX_VALUE and a rehash is necessary as is evident from the
>>>> following
>>>> excerpt from rehash.  Notice that rehash will attempt to create an array
>>>> of
>>>> negative size if the size of the backing array reaches (1/2) *
>>>> Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a
>>>> negative number.
>>>>
>>>>     int newCapacity = oldCapacity * 2 + 1;
>>>>     HashtableEntry newTable[] = new HashtableEntry[newCapacity];
>>>>
>>>> When the capacity of the backing array in a ByteArrayOutputStream
>>>> reaches
>>>> (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
>>>> operation is invoked, the capacity of the backing array is increased
>>>> only by
>>>> the required number of elements.  Notice that in the following excerpt
>>>> from
>>>> ByteArrayOutputStream.write(int) the new backing array capacity is set
>>>> to 2
>>>> * buf.length unless this value would not suffice to accommodate the
>>>> required
>>>> capacity in which case it is set to the required capacity.  If the
>>>> current
>>>> backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
>>>> buf.length << 1 overflows and resolves to a negative number resulting in
>>>> the
>>>> new capacity being set to the required capacity.  The major consequence
>>>> of
>>>> this, like with ArrayList, is that each subsequent write operation
>>>> results
>>>> in a full resize of the ByteArrayOutputStream causing performance to
>>>> degrade
>>>> significantly.
>>>>
>>>>     int newcount = count + 1;
>>>>     if (newcount > buf.length) {
>>>>             buf = Arrays.copyOf(buf, Math.max(buf.length << 1,
>>>> newcount));
>>>>     }
>>>>
>>>> It is interesting to note that any statements about the amortized time
>>>> complexity of add/insert operations, such as the one in the ArrayList
>>>> javadoc, are invalidated by the performance related bugs.  One solution
>>>> to
>>>> the above situations is to set the new capacity of the backing array to
>>>> Integer.MAX_VALUE when the initial size calculation results in a
>>>> negative
>>>> number during a resize.
>>>>
>>>> Apologies if these bugs are already known.
>>>>
>>>> Regards,
>>>>
>>>> Kevin
>>>>
>>


From Ulf.Zibis at gmx.de  Tue Mar  9 11:59:26 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 09 Mar 2010 12:59:26 +0100
Subject: Bugs in java.util.ArrayList,
	java.util.Hashtable and 	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
Message-ID: <4B96381E.6030902@gmx.de>

In PriorityQueue:

Suppose newCapacity comes out as 0xFFFFFFFC = -4;
then "if (newCapacity - MAX_ARRAY_SIZE > 0)" ---> false
then Arrays.copyOf(queue, newCapacity) ---> NegativeArraySizeException

Am I wrong ?

2.) Why don't you prefer a system-wide constant for MAX_ARRAY_SIZE ???

-Ulf


Am 09.03.2010 03:10, schrieb Martin Buchholz:
> [Chris or Alan, please review and file a bug]
>
> OK, guys,
>
> Here's a patch:
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>
> Martin
>
> On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>    
>> Hi Martin,
>>
>> Thank you for your reply.  If I may, PriorityQueue appears to employ the
>> simple strategy that I suggested above in its grow method:
>>
>>          int newCapacity = ((oldCapacity < 64)?
>>                             ((oldCapacity + 1) * 2):
>>                             ((oldCapacity / 2) * 3));
>>          if (newCapacity < 0) // overflow
>>              newCapacity = Integer.MAX_VALUE;
>>
>> It might be desirable to set a common strategy for capacity increase for all
>> collections.
>>
>> Regards,
>>
>> Kevin
>>
>> On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com> wrote:
>>      
>>> Hi Kevin,
>>>
>>> As you've noticed, creating objects within a factor of two of
>>> their natural limits is a good way to expose lurking bugs.
>>>
>>> I'm the one responsible for the algorithm in ArrayList.
>>> I'm a bit embarrassed, looking at that code today.
>>> We could set the array size to Integer.MAX_VALUE,
>>> but then you might hit an independent buglet in hotspot
>>> that you cannot allocate an array with Integer.MAX_VALUE
>>> elements, but Integer.MAX_VALUE - 5 (or so) works.
>>>
>>> It occurs to me that increasing the size by 50% is better done by
>>> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>>>
>>> I agree with the plan of setting the capacity to something near
>>> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
>>>
>>> These bugs are not known.
>>> Chris Hegarty, could you file a bug for us?
>>>
>>> Martin
>>>
>>> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com>
>>> wrote:
>>>        
>>>> Greetings,
>>>>
>>>> I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
>>>> java.io.ByteArrayOutputStream which arise when the capacities of the
>>>> data
>>>> structures reach a particular threshold.  More below.
>>>>
>>>> When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its
>>>> size
>>>> reaches its capacity and an add or an insert operation is invoked, the
>>>> capacity is increased by only one element.  Notice that in the following
>>>> excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) *
>>>> oldCapacity + 1 unless this value would not suffice to accommodate the
>>>> required capacity in which case it is set to the required capacity.  If
>>>> the
>>>> current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity
>>>> *
>>>> 3)/2 + 1 overflows and resolves to a negative number resulting in the
>>>> new
>>>> capacity being set to the required capacity.  The major consequence of
>>>> this
>>>> is that each subsequent add/insert operation results in a full resize of
>>>> the
>>>> ArrayList causing performance to degrade significantly.
>>>>
>>>>      int newCapacity = (oldCapacity * 3)/2 + 1;
>>>>      if (newCapacity < minCapacity)
>>>>          newCapacity = minCapacity;
>>>>
>>>> Hashtable breaks entirely when the size of its backing array reaches
>>>> (1/2) *
>>>> Integer.MAX_VALUE and a rehash is necessary as is evident from the
>>>> following
>>>> excerpt from rehash.  Notice that rehash will attempt to create an array
>>>> of
>>>> negative size if the size of the backing array reaches (1/2) *
>>>> Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a
>>>> negative number.
>>>>
>>>>      int newCapacity = oldCapacity * 2 + 1;
>>>>      HashtableEntry newTable[] = new HashtableEntry[newCapacity];
>>>>
>>>> When the capacity of the backing array in a ByteArrayOutputStream
>>>> reaches
>>>> (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
>>>> operation is invoked, the capacity of the backing array is increased
>>>> only by
>>>> the required number of elements.  Notice that in the following excerpt
>>>> from
>>>> ByteArrayOutputStream.write(int) the new backing array capacity is set
>>>> to 2
>>>> * buf.length unless this value would not suffice to accommodate the
>>>> required
>>>> capacity in which case it is set to the required capacity.  If the
>>>> current
>>>> backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
>>>> buf.length << 1 overflows and resolves to a negative number resulting in
>>>> the
>>>> new capacity being set to the required capacity.  The major consequence
>>>> of
>>>> this, like with ArrayList, is that each subsequent write operation
>>>> results
>>>> in a full resize of the ByteArrayOutputStream causing performance to
>>>> degrade
>>>> significantly.
>>>>
>>>>      int newcount = count + 1;
>>>>      if (newcount > buf.length) {
>>>>          buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
>>>>      }
>>>>
>>>> It is interesting to note that any statements about the amortized time
>>>> complexity of add/insert operations, such as the one in the ArrayList
>>>> javadoc, are invalidated by the performance related bugs.  One solution
>>>> to
>>>> the above situations is to set the new capacity of the backing array to
>>>> Integer.MAX_VALUE when the initial size calculation results in a
>>>> negative
>>>> number during a resize.
>>>>
>>>> Apologies if these bugs are already known.
>>>>
>>>> Regards,
>>>>
>>>> Kevin
>>>>
>>>>          
>>
>>      
>
>    


From kevin.l.stern at gmail.com  Tue Mar  9 12:04:40 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Tue, 9 Mar 2010 06:04:40 -0600
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1704b7a21003090302p5bd875f3wbc7bf131296cb8dd@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<SNT137-w17EDADB29A713F1A83FD6B8A340@phx.gbl>
	<1704b7a21003090238j74fcf04fs54e4d1aedbfa8a20@mail.gmail.com>
	<1704b7a21003090302p5bd875f3wbc7bf131296cb8dd@mail.gmail.com>
Message-ID: <1704b7a21003090404o1a3a3e56x4aa1b89c88466c28@mail.gmail.com>

Please excuse me - Martin is saving an 'if' statement in the vast majority
of scenarios since, presumably, the overflow scenario occurs very
infrequently (given that the bug has been in place for quite a while).

On Tue, Mar 9, 2010 at 5:02 AM, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:

> I did a quick search and it appears that Java is indeed two's complement
> based.  Nonetheless, please allow me to point out that, in general, this
> type of code worries me since I fully expect that at some point someone will
> come along and do exactly what Dmytro suggested; that is, someone will
> change:
>
> if (a - b > 0)
>
> to
>
> if (a > b)
>
> and the entire ship will sink.  I, personally, like to avoid obscurities
> such as making integer overflow an essential basis for my algorithm unless
> there is a good reason to do so.  I would, in general, prefer to avoid
> overflow altogether and to make the overflow scenario more explicit:
>
> if (oldCapacity > RESIZE_OVERFLOW_THRESHOLD) {
>    // do something
> } else {
>   // do something else
> }
>
> Of course, these are simply my coding preferences and I may very well be
> missing the 'good reason' to take the current approach.
>
> Regards,
>
> Kevin
>
>
> On Tue, Mar 9, 2010 at 4:38 AM, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>
>> These comparisons are essential to the working of Martin's algorithm.  I
>> found them interesting as well, but notice that when the capacity overflows
>> these comparisons will always be false.  That is to say:
>>
>> oldCapacity < minCapacity (given, otherwise we would not be resizing)
>> therefore oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity -
>> minCapacity < oldCapacity
>>
>> So if oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity >
>> Integer.MAX_VALUE, subtracting minCapacity re-overflows back into the
>> positive number realm.
>>
>> That being said, and this is a question/comment to all, I want to point
>> out that this type of code assumes a particular class of orderly overflow
>> behavior.  Is this specified in the Java spec, or will this break on an
>> obscure machine that does not use, say, two's complement arithmetic?
>>
>> Regards,
>>
>> Kevin
>>
>> 2010/3/9 Dmytro Sheyko <dmytro_sheyko at hotmail.com>
>>
>>  Is there any reason to use comparison like this
>>>
>>> if (newCapacity - minCapacity < 0)
>>>
>>> if (newCapacity - MAX_ARRAY_SIZE > 0) {
>>>
>>> instead of
>>>
>>> if (newCapacity < minCapacity)
>>>
>>> if (newCapacity > MAX_ARRAY_SIZE) {
>>>
>>> Thanks,
>>> Dmytro
>>>
>>> > Date: Mon, 8 Mar 2010 18:10:37 -0800
>>> > Subject: Re: Bugs in java.util.ArrayList, java.util.Hashtable and
>>> java.io.ByteArrayOutputStream
>>> > From: martinrb at google.com
>>> > To: kevin.l.stern at gmail.com; christopher.hegarty at sun.com;
>>> alan.bateman at sun.com
>>> > CC: core-libs-dev at openjdk.java.net
>>>
>>> >
>>> > [Chris or Alan, please review and file a bug]
>>> >
>>> > OK, guys,
>>> >
>>> > Here's a patch:
>>> >
>>> > http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>>> >
>>> > Martin
>>>
>>>

From Ulf.Zibis at gmx.de  Tue Mar  9 15:44:22 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 09 Mar 2010 16:44:22 +0100
Subject: Bugs in java.util.ArrayList,
	java.util.Hashtable and 	java.io.ByteArrayOutputStream
In-Reply-To: <1704b7a21003090302p5bd875f3wbc7bf131296cb8dd@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>	<SNT137-w17EDADB29A713F1A83FD6B8A340@phx.gbl>	<1704b7a21003090238j74fcf04fs54e4d1aedbfa8a20@mail.gmail.com>
	<1704b7a21003090302p5bd875f3wbc7bf131296cb8dd@mail.gmail.com>
Message-ID: <4B966CD6.1020502@gmx.de>

Am 09.03.2010 12:02, schrieb Kevin L. Stern:
> I did a quick search and it appears that Java is indeed two's 
> complement based.  Nonetheless, please allow me to point out that, in 
> general, this type of code worries me since I fully expect that at 
> some point someone will come along and do exactly what Dmytro 
> suggested; that is, someone will change:
>
> if (a - b > 0)
>
> to
>
> if (a > b)
>
> and the entire ship will sink.  I, personally, like to avoid 
> obscurities such as making integer overflow an essential basis for my 
> algorithm unless there is a good reason to do so.  I would, in 
> general, prefer to avoid overflow altogether and to make the overflow 
> scenario more explicit:

+1

I think those optimizations should be done by HotSpot.

-Ulf


From martinrb at google.com  Tue Mar  9 19:18:32 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 9 Mar 2010 11:18:32 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <4B96381E.6030902@gmx.de>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<4B96381E.6030902@gmx.de>
Message-ID: <1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>

On Tue, Mar 9, 2010 at 03:59, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> In PriorityQueue:
>
> let's result newCapacity in 0xFFFF.FFFC = -4
> then "if (newCapacity - MAX_ARRAY_SIZE > 0)" ---> false
> then Arrays.copyOf(queue, newCapacity) ---> ArrayIndexOutOfBoundsException

How could newCapacity ever become -4?
Since growth is by 50%.  But even 100% looks safe...

> Am I wrong ?
>
> 2.) Why don't you prefer a system-wide constant for MAX_ARRAY_SIZE ???

This should never become a public API - it's a bug in the VM.

I prefer the duplication of code to creating a new external dependency.

Martin

> -Ulf


From opinali at gmail.com  Tue Mar  9 20:02:23 2010
From: opinali at gmail.com (Osvaldo Doederlein)
Date: Tue, 9 Mar 2010 17:02:23 -0300
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<4B96381E.6030902@gmx.de>
	<1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>
Message-ID: <fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>

Should we really consider this a VM bug? I'm not sure that it's a good idea
to allocate a single object whose size exceeds 4Gb (for a byte[] - due to
the object header and array size field) - even on a 64-bit VM. An array with
2^32 elements is impossible, the maximum allowed by the size field is 2^32-1
which will be just as bad as 2^32-N for any other tiny positive N, for
algorithms that love arrays of [base-2-] "round" sizes.

And then if this bug is fixed, it may have slightly different variations.
For a long[] or double[] array, the allocation for the maximum size would
exceed 32Gb, so it exceeds the maximum heap size for 64-bit HotSpot with
CompressedOops. (Ok this is an artificial issue because we won't likely have a
100% free heap, so the only impediment for "new long[2^32-1]" would be the
array header.)

My suggestion: impose some fixed N (maybe 64, or 0x100, ...), limiting
arrays to 2^32-N (for ANY element type). The artificial restriction should
be large enough to fit the object header of any vendor's JVM, plus the
per-object overhead of any reasonable heap structure. This limit could be
added to the spec, so the implementation is not a bug anymore :) and it
would be a portable limit. Otherwise, some app may work reliably on HotSpot
if it relies on the fact that 2^32-5 positions are possible, but may break
on some other vendor's JVM where perhaps the implementation limit is 2^32-13
or something else.

A+
Osvaldo

2010/3/9 Martin Buchholz <martinrb at google.com>

> On Tue, Mar 9, 2010 at 03:59, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> > In PriorityQueue:
> >
> > let's result newCapacity in 0xFFFF.FFFC  =-4
> > then "if (newCapacity - MAX_ARRAY_SIZE > 0)" ---> false
> > then Arrays.copyOf(queue, newCapacity) --->
> ArrayIndexOutOfBoundsException
>
> How could newCapacity ever become -4?
> Since growth is by 50%.  But even 100% looks safe...
>
> > Am I wrong ?
> >
> > 2.) Why don't you prefer a system-wide constant for MAX_ARRAY_SIZE ???
>
> This should never become a public API - it's a bug in the VM.
>
> I prefer the duplication of code to creating a new external dependency.
>
> Martin
>
> > -Ulf
>

From martinrb at google.com  Tue Mar  9 20:04:06 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 9 Mar 2010 12:04:06 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1704b7a21003090302p5bd875f3wbc7bf131296cb8dd@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<SNT137-w17EDADB29A713F1A83FD6B8A340@phx.gbl>
	<1704b7a21003090238j74fcf04fs54e4d1aedbfa8a20@mail.gmail.com>
	<1704b7a21003090302p5bd875f3wbc7bf131296cb8dd@mail.gmail.com>
Message-ID: <1ccfd1c11003091204q4d9e43a5g43fd8454059d1c88@mail.gmail.com>

On Tue, Mar 9, 2010 at 03:02, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> I did a quick search and it appears that Java is indeed two's complement
> based.  Nonetheless, please allow me to point out that, in general, this
> type of code worries me since I fully expect that at some point someone will
> come along and do exactly what Dmytro suggested; that is, someone will
> change:
>
> if (a - b > 0)
>
> to
>
> if (a > b)
>
> and the entire ship will sink.  I, personally, like to avoid obscurities
> such as making integer overflow an essential basis for my algorithm unless
> there is a good reason to do so.  I would, in general, prefer to avoid
> overflow altogether and to make the overflow scenario more explicit:
>
> if (oldCapacity > RESIZE_OVERFLOW_THRESHOLD) {
>     // do something
> } else {
>     // do something else
> }

It's a good point.

In ArrayList we cannot do this (or at least not compatibly)
because ensureCapacity is a public API and effectively already
accepts negative numbers as requests for a positive capacity
that cannot be satisfied.

The current API is used like this:

        int newcount = count + len;
        ensureCapacity(newcount);

If you want to avoid overflow, you would need to change
to something less natural like

        ensureCapacity(count, len);
        int newcount = count + len;

Anyways, I'm keeping the overflow-conscious code,
but adding more warning comments,
and "out-lining" huge array creation so that
ArrayList's code now looks like:


    /**
     * Increases the capacity of this <tt>ArrayList</tt> instance, if
     * necessary, to ensure that it can hold at least the number of elements
     * specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    public void ensureCapacity(int minCapacity) {
        modCount++;
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

    /**
     * The maximum size of array to allocate.
     * Some VMs reserve some header words in an array.
     * Attempts to allocate larger arrays may result in
     * OutOfMemoryError: Requested array size exceeds VM limit
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    private int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
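
To see why the subtraction form matters, here is a minimal sketch (not
part of the webrev). It copies the capacity arithmetic above into a
hypothetical class, GrowSketch, that returns the chosen capacity instead
of allocating, and puts it next to a hypothetical newCapacityNaive that
uses the plain comparisons Kevin is worried someone will "simplify" to.

```java
// Sketch only: same arithmetic as grow()/hugeCapacity() above, but no allocation.
final class GrowSketch {
    // Same constant as in the code above.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Hypothetical helper: the capacity grow() would pick.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // may wrap negative
        if (newCapacity - minCapacity < 0)     // overflow-conscious
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)  // overflow-conscious
            newCapacity = hugeCapacity(minCapacity);
        return newCapacity;
    }

    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    // Hypothetical "simplified" variant using plain comparisons.
    static int newCapacityNaive(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // may wrap negative
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        if (newCapacity > MAX_ARRAY_SIZE)
            newCapacity = hugeCapacity(minCapacity);
        return newCapacity;
    }

    public static void main(String[] args) {
        int old = 2_000_000_000; // old + old/2 wraps to a negative int
        System.out.println(newCapacity(old, old + 1));      // 2147483639
        System.out.println(newCapacityNaive(old, old + 1)); // 2000000001
    }
}
```

With the plain comparisons the wrapped value just looks "too small", so
the capacity grows by a single element, which is the original
performance bug. The subtraction form notices the wrap and jumps to
MAX_ARRAY_SIZE.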

Webrev regenerated.

Martin

> Of course, these are simply my coding preferences and I may very well be
> missing the 'good reason' to take the current approach.
>
> Regards,
>
> Kevin
>
> On Tue, Mar 9, 2010 at 4:38 AM, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
>>
>> These comparisons are essential to the working of Martin's algorithm.  I
>> found them interesting as well, but notice that when the capacity overflows
>> these comparisons will always be false.  That is to say:
>>
>> oldCapacity < minCapacity (given, otherwise we would not be resizing)
>> therefore oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity -
>> minCapacity < oldCapacity
>>
>> So if oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity >
>> Integer.MAX_VALUE, subtracting minCapacity re-overflows back into the
>> positive number realm.
>>
>> That being said, and this is a question/comment to all, I want to point
>> out that this type of code assumes a particular class of orderly overflow
>> behavior.  Is this specified in the Java spec, or will this break on an
>> obscure machine that does not use, say, two's complement arithmetic?
>>
>> Regards,
>>
>> Kevin
>>
>> 2010/3/9 Dmytro Sheyko <dmytro_sheyko at hotmail.com>
>>>
>>> Is there any reason to use comparison like this
>>>
>>> if (newCapacity - minCapacity < 0)
>>>
>>> if (newCapacity - MAX_ARRAY_SIZE > 0) {
>>>
>>> instead of
>>>
>>> if (newCapacity < minCapacity)
>>>
>>> if (newCapacity > MAX_ARRAY_SIZE) {
>>>
>>> Thanks,
>>> Dmytro
>>>
>>> > Date: Mon, 8 Mar 2010 18:10:37 -0800
>>> > Subject: Re: Bugs in java.util.ArrayList, java.util.Hashtable and
>>> > java.io.ByteArrayOutputStream
>>> > From: martinrb at google.com
>>> > To: kevin.l.stern at gmail.com; christopher.hegarty at sun.com;
>>> > alan.bateman at sun.com
>>> > CC: core-libs-dev at openjdk.java.net
>>> >
>>> > [Chris or Alan, please review and file a bug]
>>> >
>>> > OK, guys,
>>> >
>>> > Here's a patch:
>>> >
>>> > http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>>> >
>>> > Martin
>>>
>>>
>
>


From martinrb at google.com  Tue Mar  9 20:15:41 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 9 Mar 2010 12:15:41 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<4B96381E.6030902@gmx.de>
	<1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>
	<fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>
Message-ID: <1ccfd1c11003091215x7cc353calf8b03579a86a58f9@mail.gmail.com>

It surely is not a good idea to use a single backing array
for huge arrays.  As you point out, it's up to 32GB
for just one object.  But the core JDK
doesn't offer a suitable alternative for users who need very
large collections.

It would have been more in the spirit of Java to have a
collection class instead of ArrayList that was not fastest at
any particular operation, but had excellent asymptotic behaviour,
based on backing arrays containing backing arrays.
But:
- no such excellent class has been written yet
  (or please point me to such a class)
- even if it were, such a best-of-breed-general-purpose
  List implementation would probably need to be introduced as a
  separate class, because of the performance expectations of
  existing implementations.

In the meantime, we have to maintain what we got,
and that includes living with arrays and classes that wrap them.

Changing the spec is unlikely to succeed.

Martin

On Tue, Mar 9, 2010 at 12:02, Osvaldo Doederlein <opinali at gmail.com> wrote:
> Should we really consider this a VM bug? I'm not sure that it's a good idea
> to allocate a single object which size exceeds 4Gb (for a byte[] - due to
> the object header and array size field) - even on a 64-bit VM. An array with
> 2^32 elements is impossible, the maximum allowed by the size field is 2^32-1
> which will be just as bad as 2^32-N for any other tiny positive N, for
> algorithms that love arrays of [base-2-] "round" sizes.
>
> And then if this bug is fixed, it may have slightly different variations.
> For a long[] or double[] array, the allocation for the maximum size would
> exceed 32Gb, so it exceeds the maximum heap size for 64-bit HotSpot with
> CompressedOops. (Ok this is an artificial issue because we won't like have a
> 100% free heap, so the only impediment for "new long[2^32-1]" would be the
> array header.)
>
> My suggestion: impose some fixed N (maybe 64, or 0x100, ...), limiting
> arrays to 2^32-N (for ANY element type). The artificial restriction should
> be large enough to fit the object header of any vendor's JVM, plus the
> per-object overhead of any reasonable heap structure. This limit could be
> added to the spec, so the implementation is not a bug anymore :) and it
> would be a portable limit. Otherwise, some app may work reliably on HotSpot
> if it relies on the fact that 2^32-5 positions are possible, but may break
> on some other vendor's JVM where perhaps the implementation limit is 2^32-13
> or something else.
>
> A+
> Osvaldo
>
> 2010/3/9 Martin Buchholz <martinrb at google.com>
>>
>> On Tue, Mar 9, 2010 at 03:59, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>> > In PriorityQueue:
>> >
>> > let's result newCapacity in 0xFFFF.FFFC = -4
>> > then "if (newCapacity - MAX_ARRAY_SIZE > 0)" ---> false
>> > then Arrays.copyOf(queue, newCapacity) --->
>> > ArrayIndexOutOfBoundsException
>>
>> How could newCapacity ever become -4?
>> Since growth is by 50%.  But even 100% looks safe...
>>
>> > Am I wrong ?
>> >
>> > 2.) Why don't you prefer a system-wide constant for MAX_ARRAY_SIZE ???
>>
>> This should never become a public API - it's a bug in the VM.
>>
>> I prefer the duplication of code to creating a new external dependency.
>>
>> Martin
>>
>> > -Ulf
>
>


From Ulf.Zibis at gmx.de  Tue Mar  9 20:25:28 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 09 Mar 2010 21:25:28 +0100
Subject: Bugs in java.util.ArrayList,
	java.util.Hashtable and 	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>	
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>	
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>	
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>	
	<4B96381E.6030902@gmx.de>
	<1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>
Message-ID: <4B96AEB8.4070406@gmx.de>

Am 09.03.2010 20:18, schrieb Martin Buchholz:
> On Tue, Mar 9, 2010 at 03:59, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> In PriorityQueue:
>>
>> let's result newCapacity in 0xFFFF.FFFC  =-4
>> then "if (newCapacity - MAX_ARRAY_SIZE > 0)" ---> false
>> then Arrays.copyOf(queue, newCapacity) --->  ArrayIndexOutOfBoundsException
>>      
> How could newCapacity ever become -4?
> Since growth is by 50%.

Oops, I must admit, that I didn't evaluate that.
Many tricks are interwoven here at one place.
I think that magic should be better commented. Isn't PriorityQueue a 
public API, visible to everybody?
As said, I think Hotspot compiler would be the better place to optimize 
those if...else branches.

>    But even 100% looks safe...
>    

Hm, having oldCapacity = 0x7FFF.FFFE + 100 % makes 0xFFFF.FFFC
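
The arithmetic can be checked in isolation. A minimal sketch, assuming
the same MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8 as in the ArrayList
patch; the class and method names are made up for illustration, and this
says nothing about whether PriorityQueue can actually reach such a
capacity while doubling:

```java
// Sketch only: what the overflow-conscious test sees after 100% vs 50% growth.
final class DoublingOverflow {
    // Assumed to match the constant in the ArrayList patch.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    static int doubled(int oldCapacity) {
        return oldCapacity + oldCapacity;        // 100% growth, may wrap
    }

    static int grownByHalf(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1); // 50% growth, may wrap
    }

    // The overflow-conscious test from the patch.
    static boolean exceedsMax(int newCapacity) {
        return newCapacity - MAX_ARRAY_SIZE > 0;
    }

    public static void main(String[] args) {
        int oldCapacity = 0x7FFFFFFE;
        System.out.println(doubled(oldCapacity));                 // -4
        System.out.println(exceedsMax(doubled(oldCapacity)));     // false
        System.out.println(exceedsMax(grownByHalf(oldCapacity))); // true
    }
}
```

So with 100% growth from 0x7FFF.FFFE the wrapped value -4 slips past the
check, while with 50% growth the wrapped value is still caught.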

>    
>> Am I wrong ?
>>
>> 2.) Why don't you prefer a system-wide constant for MAX_ARRAY_SIZE ???
>>      
> This should never become a public API - it's a bug in the VM.
>
> I prefer the duplication of code to creating a new external dependency.
>    

Good use case for new super package facility.

I can sympathise with your reservations. On the other hand ...
- if there is a limit, developers should have a chance to check 
against it to avoid OutOfMemoryError.
- maybe other VMs have a different, much lower limit, e.g. on small mobile 
systems.
- If the bug gets fixed, who takes care of the "garbage 
collection" in the code base?
- There is plenty of public stuff in the sun.misc.VM class, why not 
MAX_ARRAY_SIZE/maxArraySize()?

-Ulf


From Ulf.Zibis at gmx.de  Tue Mar  9 21:08:15 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 09 Mar 2010 22:08:15 +0100
Subject: Bugs in java.util.ArrayList,
	java.util.Hashtable and 	java.io.ByteArrayOutputStream
In-Reply-To: <fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>	
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>	
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>	
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>	
	<4B96381E.6030902@gmx.de>	
	<1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>
	<fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>
Message-ID: <4B96B8BF.2020001@gmx.de>

Am 09.03.2010 21:02, schrieb Osvaldo Doederlein:
> Should we really consider this a VM bug? I'm not sure that it's a good 
> idea to allocate a single object which size exceeds 4Gb (for a byte[] 
> - due to the object header and array size field) - even on a 64-bit 
> VM. An array with 2^32 elements is impossible, the maximum allowed by 
> the size field is 2^32-1 which will be just as bad as 2^32-N for any 
> other tiny positive N, for algorithms that love arrays of [base-2-] 
> "round" sizes.
>
> And then if this bug is fixed, it may have slightly different 
> variations. For a long[] or double[] array, the allocation for the 
> maximum size would exceed 32Gb, so it exceeds the maximum heap size 
> for 64-bit HotSpot with CompressedOops. (Ok this is an artificial 
> issue because we won't like have a 100% free heap, so the only 
> impediment for "new long[2^32-1]" would be the array header.)
>
> My suggestion: impose some fixed N (maybe 64, or 0x100, ...), limiting 
> arrays to 2^32-N (for ANY element type). The artificial restriction 
> should be large enough to fit the object header of any vendor's JVM, 
> plus the per-object overhead of any reasonable heap structure. This 
> limit could be added to the spec, so the implementation is not a bug 
> anymore :) and it would be a portable limit. Otherwise, some app may 
> work reliably on HotSpot if it relies on the fact that 2^32-5 
> positions are possible, but may break on some other vendor's JVM where 
> perhaps the implementation limit is 2^32-13 or something else.
>

Please allow me to correct:
it's 2^31-N !

...but +1 for your arguments.
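
To put numbers on the corrected bound: a minimal sketch that computes
only the element payload of an array with Integer.MAX_VALUE (2^31-1)
elements. Header and alignment overhead are VM-specific and deliberately
not modelled; the class and method names are invented for illustration.

```java
// Sketch only: payload bytes of the largest array an int length can describe.
final class MaxArrayBytes {
    // elementSize is in bytes: 1 for byte[], 8 for long[] or double[].
    static long payloadBytes(int elementSize) {
        return (long) Integer.MAX_VALUE * elementSize; // widen first to avoid int overflow
    }

    public static void main(String[] args) {
        System.out.println(payloadBytes(1)); // 2147483647  (just under 2 GiB)
        System.out.println(payloadBytes(8)); // 17179869176 (just under 16 GiB)
    }
}
```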

In the [base-2-] "round" sense, why is there the "+1" in [1] ?

I think [2] would look best. I'm sure, HotSpot anyway would optimize to 
(oldCapacity + (oldCapacity >> 1))
Look at the HotSpot disassembly for String#hashCode(), h*31 becomes h<<5-h.

-Ulf


[1] current PriorityQueue snippet:
...
         int newCapacity = ((oldCapacity < 64)?
                            ((oldCapacity + 1) * 2):
                            ((oldCapacity / 2) * 3));
...
[2] new PriorityQueue snippet:
...
         int newCapacity += (oldCapacity < 64) ?
                            oldCapacity : oldCapacity / 2;
...

From martinrb at google.com  Tue Mar  9 22:08:46 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 9 Mar 2010 14:08:46 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <4B96B8BF.2020001@gmx.de>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<4B96381E.6030902@gmx.de>
	<1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>
	<fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>
	<4B96B8BF.2020001@gmx.de>
Message-ID: <1ccfd1c11003091408g4eb63e1ckd92ef6156846086b@mail.gmail.com>

On Tue, Mar 9, 2010 at 13:08, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
> [1] current PriorityQueue snippet:
> ...
>         int newCapacity = ((oldCapacity < 64)?
>                            ((oldCapacity + 1) * 2):
>                            ((oldCapacity / 2) * 3));
> ...
> [2] new PriorityQueue snippet:
> ...
>         int newCapacity += (oldCapacity < 64) ?
>                            oldCapacity : oldCapacity / 2;
> ...

Thanks, I took your suggestion and changed it to:

        int newCapacity = oldCapacity + ((oldCapacity < 64) ?
                                         (oldCapacity + 2) :
                                         (oldCapacity >> 1));


Martin


From Ulf.Zibis at gmx.de  Tue Mar  9 23:11:06 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 10 Mar 2010 00:11:06 +0100
Subject: Bugs in java.util.ArrayList,
	java.util.Hashtable and 	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003091408g4eb63e1ckd92ef6156846086b@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>	
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>	
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>	
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>	
	<4B96381E.6030902@gmx.de>	
	<1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>	
	<fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>	
	<4B96B8BF.2020001@gmx.de>
	<1ccfd1c11003091408g4eb63e1ckd92ef6156846086b@mail.gmail.com>
Message-ID: <4B96D58A.1050201@gmx.de>

Am 09.03.2010 23:08, schrieb Martin Buchholz:
> On Tue, Mar 9, 2010 at 13:08, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> [1] current PriorityQueue snippet:
>> ...
>>          int newCapacity = ((oldCapacity < 64)?
>>                             ((oldCapacity + 1) * 2):
>>                             ((oldCapacity / 2) * 3));
>> ...
>> [2] new PriorityQueue snippet:
>> ...
>>          int newCapacity += (oldCapacity < 64) ?
>>                             oldCapacity : oldCapacity / 2;
>> ...
>>      
> Thanks, I took your suggestion and changed it to:
>
>          int newCapacity = oldCapacity + ((oldCapacity < 64) ?
>                                           (oldCapacity + 2) :
>                                           (oldCapacity >> 1));
>    

Oops :-[

Can you explain the mystery about "+ 2" ?

-Ulf


From martinrb at google.com  Tue Mar  9 23:22:49 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 9 Mar 2010 15:22:49 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <4B96D58A.1050201@gmx.de>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<4B96381E.6030902@gmx.de>
	<1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>
	<fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>
	<4B96B8BF.2020001@gmx.de>
	<1ccfd1c11003091408g4eb63e1ckd92ef6156846086b@mail.gmail.com>
	<4B96D58A.1050201@gmx.de>
Message-ID: <1ccfd1c11003091522l66b7d92ndea2e2105536bb8c@mail.gmail.com>

On Tue, Mar 9, 2010 at 15:11, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:

> Can you explain the mystery about "+ 2" ?

It's exactly the same as the old resizing behavior.
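
This can be checked mechanically. A minimal sketch comparing the two
expressions quoted earlier in the thread; the names oldRule and newRule
are made up here. For capacities below 64, (c + 1) * 2 and c + (c + 2)
are the same number. For larger capacities the two branches agree on
even values and differ by one on odd values, because c / 2 truncates
before the multiplication by 3.

```java
// Sketch only: old vs. new PriorityQueue growth expressions from this thread.
final class PriorityQueueGrowth {
    // Snippet [1]: the current code.
    static int oldRule(int oldCapacity) {
        return (oldCapacity < 64)
            ? ((oldCapacity + 1) * 2)
            : ((oldCapacity / 2) * 3);
    }

    // The replacement proposed above.
    static int newRule(int oldCapacity) {
        return oldCapacity + ((oldCapacity < 64)
            ? (oldCapacity + 2)
            : (oldCapacity >> 1));
    }

    public static void main(String[] args) {
        for (int c = 0; c < 64; c++) {
            if (oldRule(c) != newRule(c))
                throw new AssertionError("mismatch at " + c);
        }
        System.out.println(oldRule(64) + " " + newRule(64)); // 96 96
        System.out.println(oldRule(65) + " " + newRule(65)); // 96 97
    }
}
```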

Martin


From martinrb at google.com  Wed Mar 10 01:05:06 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 9 Mar 2010 17:05:06 -0800
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4B92C263.9020404@gmx.de>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>
Message-ID: <1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>

Here's the proposed fix for
6931812: A better implementation of sun.nio.cs.Surrogate.isBMP(int)

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint/

I changed the name to isBMPCodePoint in preparation for moving
it to Character.java.
(Sherman, perhaps you would like to take on that follow-on task?)
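
The webrev itself is not reproduced in this message. As a rough sketch
of the shift-based idea discussed in the exchange quoted below, and
assuming that is the approach taken, the checks could look like this.
The class name is hypothetical, and the constants are written out
instead of using the ones from Character:

```java
// Sketch only: shift-based code point range checks (class name is made up).
final class CodePointChecks {
    // BMP code points are 0x0000..0xFFFF, i.e. all bits above bit 15 are zero.
    static boolean isBMPCodePoint(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // Supplementary code points are 0x10000..0x10FFFF, i.e. plane 1..16.
    static boolean isSupplementaryCodePoint(int codePoint) {
        int x = codePoint >>> 16;
        return x != 0 && x < 0x11;
    }

    public static void main(String[] args) {
        System.out.println(isBMPCodePoint(0xFFFF));             // true
        System.out.println(isBMPCodePoint(0x10000));            // false
        System.out.println(isSupplementaryCodePoint(0x10FFFF)); // true
        System.out.println(isSupplementaryCodePoint(0x110000)); // false
    }
}
```

Negative inputs come out false for both checks, since the unsigned shift
leaves 0xFFFF in the low bits.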

Sherman, please approve.

Martin

On Sat, Mar 6, 2010 at 13:00, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Very fast Sherman, much thanks.
>
> Could you set the bug to accepted and evaluated, so my patch will have a
> chance to get into the code base?
>
> -Ulf
>
>
> Am 03.03.2010 20:11, schrieb Xueming Shen:
>>
>> #6931812
>>
>> Martin Buchholz wrote:
>>>
>>> Sherman, would you like to file bugs for Ulf's improvements?
>>>
>>> On Wed, Mar 3, 2010 at 02:44, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>>
>>>> Am 03.03.2010 09:00, schrieb Martin Buchholz:
>>>
>>>>> Keep in mind that supplementary characters are extremely rare.
>>>>>
>>>> Yes, but many API's in the JDK are used rarely.
>>>> Why should they waste memory footprint / perform bad, particularly if it
>>>> doesn't cost anything.
>>>
>>> I admire your perfectionism.
>>>
>>>>> Therefore the existing implementation
>>>>>
>>>>>  return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>      && codePoint <= MAX_CODE_POINT;
>>>>>
>>>>> will almost always perform just one comparison against a constant,
>>>>> which is hard to beat.
>>>>>
>>>> 1. Wondering: I think there are TWO comparisons.
>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>> against
>>>> only 8 bit values in my case.
>>>
>>> It's a good point.  In the machine code, shifts are likely to use
>>> immediate values, and so will be a small win.
>>>
>>> int x = codePoint >>> 16;
>>> return x != 0 && x < 0x11;
>>>
>>> (On modern hardware, these optimizations
>>> are less valuable than they used to be;
>>> ordinary integer arithmetic is almost free)
>>>
>>> Martin
>>
>>
>
>


From gokdogan at gmail.com  Wed Mar 10 09:58:38 2010
From: gokdogan at gmail.com (Goktug Gokdogan)
Date: Wed, 10 Mar 2010 03:58:38 -0600
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
Message-ID: <f5b3aa371003100158i8fbfccahbc5a5bf87420feec@mail.gmail.com>

Similarly,
  BitSet.ensureCapacity
  AbstractStringBuilder.expandCapacity
  Vector.ensureCapacityHelper
methods need to have similar checks and/or throw proper exceptions.
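
To illustrate what goes wrong without such checks, here is a minimal
sketch that isolates the pre-patch formulas quoted further down in this
message (from Kevin's original report). The class and method names are
invented; only the arithmetic is taken from the quoted excerpts.

```java
// Sketch only: the old growth formulas, with no overflow handling.
final class OldGrowthOverflow {
    // Old ArrayList.ensureCapacity arithmetic.
    static int oldArrayListCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = (oldCapacity * 3) / 2 + 1; // oldCapacity * 3 may wrap
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        return newCapacity;
    }

    // Old ByteArrayOutputStream.write(int) arithmetic.
    static int oldByteArrayCapacity(int bufLength, int newcount) {
        return Math.max(bufLength << 1, newcount); // bufLength << 1 may wrap
    }

    public static void main(String[] args) {
        // Normal case: grows by roughly 50%.
        System.out.println(oldArrayListCapacity(10, 11));                       // 16
        // Large case: the product wraps, so the list grows by one element.
        System.out.println(oldArrayListCapacity(1_000_000_000, 1_000_000_001)); // 1000000001
        // Large buffer: the shift wraps to Integer.MIN_VALUE, growth is by one byte.
        System.out.println(oldByteArrayCapacity(1 << 30, (1 << 30) + 1));       // 1073741825
    }
}
```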

By the way, I did not understand why IdentityHashMap and HashMap have
different MAXIMUM_CAPACITY and different logic to handle resize and
overflow.


On Mon, Mar 8, 2010 at 8:10 PM, Martin Buchholz <martinrb at google.com> wrote:

> [Chris or Alan, please review and file a bug]
>
> OK, guys,
>
> Here's a patch:
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>
> Martin
>
> On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
> > Hi Martin,
> >
> > Thank you for your reply.  If I may, PriorityQueue appears to employ the
> > simple strategy that I suggested above in its grow method:
> >
> >         int newCapacity = ((oldCapacity < 64)?
> >                            ((oldCapacity + 1) * 2):
> >                            ((oldCapacity / 2) * 3));
> >         if (newCapacity < 0) // overflow
> >             newCapacity = Integer.MAX_VALUE;
> >
> > It might be desirable to set a common strategy for capacity increase for
> all
> > collections.
> >
> > Regards,
> >
> > Kevin
> >
> > On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com>
> wrote:
> >>
> >> Hi Kevin,
> >>
> >> As you've noticed, creating objects within a factor of two of
> >> their natural limits is a good way to expose lurking bugs.
> >>
> >> I'm the one responsible for the algorithm in ArrayList.
> >> I'm a bit embarrassed, looking at that code today.
> >> We could set the array size to Integer.MAX_VALUE,
> >> but then you might hit an independent buglet in hotspot
> >> that you cannot allocate an array with Integer.MAX_VALUE
> >> elements, but Integer.MAX_VALUE - 5 (or so) works.
> >>
> >> It occurs to me that increasing the size by 50% is better done by
> >> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
> >>
> >> I agree with the plan of setting the capacity to something near
> >> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
> >>
> >> These bugs are not known.
> >> Chris Hegarty, could you file a bug for us?
> >>
> >> Martin
> >>
> >> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com>
> >> wrote:
> >> > Greetings,
> >> >
> >> > I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
> >> > java.io.ByteArrayOutputStream which arise when the capacities of the
> >> > data
> >> > structures reach a particular threshold.  More below.
> >> >
> >> > When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its
> >> > size
> >> > reaches its capacity and an add or an insert operation is invoked, the
> >> > capacity is increased by only one element.  Notice that in the
> following
> >> > excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2)
> *
> >> > oldCapacity + 1 unless this value would not suffice to accommodate the
> >> > required capacity in which case it is set to the required capacity.
> If
> >> > the
> >> > current capacity is at least (2/3)*Integer.MAX_VALUE, then
> (oldCapacity
> >> > *
> >> > 3)/2 + 1 overflows and resolves to a negative number resulting in the
> >> > new
> >> > capacity being set to the required capacity.  The major consequence of
> >> > this
> >> > is that each subsequent add/insert operation results in a full resize
> of
> >> > the
> >> > ArrayList causing performance to degrade significantly.
> >> >
> >> >         int newCapacity = (oldCapacity * 3)/2 + 1;
> >> >         if (newCapacity < minCapacity)
> >> >             newCapacity = minCapacity;
> >> >
> >> > Hashtable breaks entirely when the size of its backing array reaches
> >> > (1/2) *
> >> > Integer.MAX_VALUE and a rehash is necessary as is evident from the
> >> > following
> >> > excerpt from rehash.  Notice that rehash will attempt to create an
> array
> >> > of
> >> > negative size if the size of the backing array reaches (1/2) *
> >> > Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to
> a
> >> > negative number.
> >> >
> >> >     int newCapacity = oldCapacity * 2 + 1;
> >> >     HashtableEntry newTable[] = new HashtableEntry[newCapacity];
> >> >
> >> > When the capacity of the backing array in a ByteArrayOutputStream
> >> > reaches
> >> > (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
> >> > operation is invoked, the capacity of the backing array is increased
> >> > only by
> >> > the required number of elements.  Notice that in the following excerpt
> >> > from
> >> > ByteArrayOutputStream.write(int) the new backing array capacity is set
> >> > to 2
> >> > * buf.length unless this value would not suffice to accommodate the
> >> > required
> >> > capacity in which case it is set to the required capacity.  If the
> >> > current
> >> > backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
> >> > buf.length << 1 overflows and resolves to a negative number resulting
> in
> >> > the
> >> > new capacity being set to the required capacity.  The major
> consequence
> >> > of
> >> > this, like with ArrayList, is that each subsequent write operation
> >> > results
> >> > in a full resize of the ByteArrayOutputStream causing performance to
> >> > degrade
> >> > significantly.
> >> >
> >> >     int newcount = count + 1;
> >> >     if (newcount > buf.length) {
> >> >             buf = Arrays.copyOf(buf, Math.max(buf.length << 1,
> >> > newcount));
> >> >     }
> >> >
> >> > It is interesting to note that any statements about the amortized time
> >> > complexity of add/insert operations, such as the one in the ArrayList
> >> > javadoc, are invalidated by the performance related bugs.  One
> solution
> >> > to
> >> > the above situations is to set the new capacity of the backing array
> to
> >> > Integer.MAX_VALUE when the initial size calculation results in a
> >> > negative
> >> > number during a resize.
> >> >
> >> > Apologies if these bugs are already known.
> >> >
> >> > Regards,
> >> >
> >> > Kevin
> >> >
> >
> >
>
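
The three growth expressions quoted above can be evaluated directly to see where they wrap. The following is a minimal sketch, not JDK code: the class and method names are made up for illustration, and only the arithmetic expressions are taken from the quoted excerpts. One detail worth noting is that, in 32-bit int arithmetic, the product oldCapacity * 3 already wraps once oldCapacity exceeds Integer.MAX_VALUE / 3.

```java
// Illustrative only: evaluates the quoted growth expressions near their limits.
final class GrowthOverflowDemo {
    // Expression from the quoted ArrayList.ensureCapacity excerpt.
    static int arrayListGrowth(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    // Expression from the quoted Hashtable.rehash excerpt.
    static int hashtableGrowth(int oldCapacity) {
        return oldCapacity * 2 + 1;
    }

    // Expression from the quoted ByteArrayOutputStream.write(int) excerpt.
    static int byteArrayGrowth(int length) {
        return length << 1;
    }

    public static void main(String[] args) {
        int justOverThird = Integer.MAX_VALUE / 3 + 1; // first capacity where * 3 wraps
        int twoToThe30 = 1 << 30;                      // first capacity where * 2 wraps

        // All three results are negative, so the caller either falls back to
        // minCapacity (ArrayList, ByteArrayOutputStream) or tries to allocate
        // an array of negative size (Hashtable).
        System.out.println(arrayListGrowth(justOverThird));
        System.out.println(hashtableGrowth(twoToThe30));
        System.out.println(byteArrayGrowth(twoToThe30));
    }
}
```

A negative result is what makes the ArrayList and ByteArrayOutputStream code fall back to growing by exactly the required amount on every call.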

From christopher.hegarty at sun.com  Wed Mar 10 14:47:32 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Wed, 10 Mar 2010 14:47:32 +0000
Subject: hg: jdk7/tl/jdk: 6933618:
	java/net/MulticastSocket/NoLoopbackPackets.java fails when rerun
Message-ID: <20100310144817.62C2D445BC@hg.openjdk.java.net>

Changeset: 47958f76babc
Author:    chegar
Date:      2010-03-10 14:44 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/47958f76babc

6933618: java/net/MulticastSocket/NoLoopbackPackets.java fails when rerun
Reviewed-by: alanb

! test/java/net/MulticastSocket/NoLoopbackPackets.java


From Ulf.Zibis at gmx.de  Wed Mar 10 17:36:48 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 10 Mar 2010 18:36:48 +0100
Subject: Bugs in java.util.ArrayList,
	java.util.Hashtable and java.io.ByteArrayOutputStream
In-Reply-To: <1ccfd1c11003091522l66b7d92ndea2e2105536bb8c@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>	
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>	
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>	
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>	
	<4B96381E.6030902@gmx.de>	
	<1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>	
	<fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>	
	<4B96B8BF.2020001@gmx.de>	
	<1ccfd1c11003091408g4eb63e1ckd92ef6156846086b@mail.gmail.com>	
	<4B96D58A.1050201@gmx.de>
	<1ccfd1c11003091522l66b7d92ndea2e2105536bb8c@mail.gmail.com>
Message-ID: <4B97D8B0.7020604@gmx.de>

On 10.03.2010 00:22, Martin Buchholz wrote:
> On Tue, Mar 9, 2010 at 15:11, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>
>    
>> Can you explain the mystery about "+ 2" ?
>>      
> It's exactly the same as the old resizing behavior.

To be precise, I meant to ask whether you have any idea why the original
designers chose the "+1".
The code would be simpler if it were omitted, and it would then produce the
"round" (power-of-two) array sizes that algorithms favor.

-Ulf


From Ulf.Zibis at gmx.de  Wed Mar 10 17:58:11 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 10 Mar 2010 18:58:11 +0100
Subject: Progress of patches
In-Reply-To: <4B96F476.1060409@gmx.de>
References: <4B96E323.80505@gmx.de>
	<1ccfd1c11003091710u1e2ec8cevf64110ee3af2d88b@mail.gmail.com>
	<4B96F476.1060409@gmx.de>
Message-ID: <4B97DDB3.2080107@gmx.de>

Hi Martin,

there wasn't enough time today, so please wait for tomorrow.

In brief:
- I wouldn't rename it to isBMPCodePoint(), because many other names in
the Surrogate class don't match Character either, and keeping the name
would avoid a usages search in sun.nio.cs.* or wherever else. Better to add
"//  return Character.isBMPCodePoint(uc);" as a hint for the future.
- Thanks for mentioning me as contributor.
- Doesn't the bug description include the addition of isBMPCodePoint()
to class Character and the equivalent enhancement to
isSupplementaryCodePoint()?

-Ulf


On 10.03.2010 02:23, Ulf Zibis wrote:
> Much thanks Martin,
>
> I'll do it tomorrow. Now it's time to sleep here in Germany.
>
> -Ulf
>
>
> On 10.03.2010 02:10, Martin Buchholz wrote:
>> Do you have a collection of patch files to be upstreamed?
>>
>> Easiest for me would be a publicly readable
>> directory of patches like I maintain on
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/
>>
>> that I can
>> hg qimport
>> into my own mq outgoing jdk repo.
>>
>> (Thanks for your hard work)
>>
>> Martin
>>


From Xueming.Shen at Sun.COM  Wed Mar 10 18:23:57 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Wed, 10 Mar 2010 10:23:57 -0800
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
Message-ID: <4B97E3BD.2000901@sun.com>

approved.

I don't have a spare ws right now, so please just push; it's almost there :-)

sherman

Martin Buchholz wrote:
> Here's the proposed fix for
> 6931812: A better implementation of sun.nio.cs.Surrogate.isBMP(int)
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint/
>
> I changed the name to isBMPCodePoint in preparation for moving
> it to Character.java.
> (Sherman, perhaps you would like to take on that followon task?)
>
> Sherman, please approve.
>
> Martin
>
> On Sat, Mar 6, 2010 at 13:00, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>   
>> Very fast Sherman, much thanks.
>>
>> Could you set the bug to accepted and evaluated, so my patch will have a
>> chance to get into the code base?
>>
>> -Ulf
>>
>>
>> On 03.03.2010 20:11, Xueming Shen wrote:
>>     
>>> #6931812
>>>
>>> Martin Buchholz wrote:
>>>       
>>>> Sherman, would you like to file bugs for Ulf's improvements?
>>>>
>>>> On Wed, Mar 3, 2010 at 02:44, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>>         
>>>>> On 03.03.2010 09:00, Martin Buchholz wrote:
>>>>>           
>>>>>> Keep in mind that supplementary characters are extremely rare.
>>>>>>
>>>>>>             
>>>>> Yes, but many API's in the JDK are used rarely.
>>>>> Why should they waste memory footprint / perform bad, particularly if it
>>>>> doesn't cost anything.
>>>>>           
>>>> I admire your perfectionism.
>>>>
>>>>         
>>>>>> Therefore the existing implementation
>>>>>>
>>>>>>  return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>>      && codePoint <= MAX_CODE_POINT;
>>>>>>
>>>>>> will almost always perform just one comparison against a constant,
>>>>>> which is hard to beat.
>>>>>>
>>>>>>             
>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>>> against
>>>>> only 8 bit values in my case.
>>>>>           
>>>> It's a good point.  In the machine code, shifts are likely to use
>>>> immediate values, and so will be a small win.
>>>>
>>>> int x = codePoint >>> 16;
>>>> return x != 0 && x < 0x11;
>>>>
>>>> (On modern hardware, these optimizations
>>>> are less valuable than they used to be;
>>>> ordinary integer arithmetic is almost free)
>>>>
>>>> Martin
>>>>         
>>>       
>>     


From martinrb at google.com  Wed Mar 10 23:14:43 2010
From: martinrb at google.com (martinrb at google.com)
Date: Wed, 10 Mar 2010 23:14:43 +0000
Subject: hg: jdk7/tl/jdk: 6931812: A better implementation of
	sun.nio.cs.Surrogate.isBMP(int)
Message-ID: <20100310231507.8C5B944633@hg.openjdk.java.net>

Changeset: 467484e025d6
Author:    martin
Date:      2010-03-10 14:53 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/467484e025d6

6931812: A better implementation of sun.nio.cs.Surrogate.isBMP(int)
Summary: uc >> 16 == 0 is superior to (int) (char) uc == uc
Reviewed-by: sherman
Contributed-by: Ulf Zibis <ulf.zibis at gmx.de>

! src/share/classes/sun/nio/cs/Surrogate.java
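
The summary line compares two ways of testing whether an int lies in the Basic Multilingual Plane. A minimal sketch of the two expressions side by side follows; the class and method names are hypothetical and are not the names used in sun.nio.cs.Surrogate. It only spot-checks that both expressions agree at the boundaries and says nothing about the generated machine code.

```java
// Illustrative comparison of the two BMP tests named in the changeset summary.
final class IsBmpSketch {
    // Older form: round-trip through char and compare with the input.
    static boolean byCast(int uc) {
        return (int) (char) uc == uc;
    }

    // Newer form: all bits above the low 16 must be zero.
    // With the signed shift >>, negative inputs yield a negative value, never 0.
    static boolean byShift(int uc) {
        return uc >> 16 == 0;
    }

    public static void main(String[] args) {
        int[] samples = { Integer.MIN_VALUE, -1, 0, 0xFFFF, 0x10000, 0x10FFFF, Integer.MAX_VALUE };
        for (int uc : samples) {
            System.out.println(Integer.toHexString(uc) + ": cast=" + byCast(uc) + " shift=" + byShift(uc));
        }
    }
}
```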


From jonathan.gibbons at sun.com  Thu Mar 11 00:24:36 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Thu, 11 Mar 2010 00:24:36 +0000
Subject: hg: jdk7/tl/langtools: 6933914: fix missing newlines
Message-ID: <20100311002440.1FD1244644@hg.openjdk.java.net>

Changeset: 9871ce4fd56f
Author:    jjg
Date:      2010-03-10 16:23 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/9871ce4fd56f

6933914: fix missing newlines
Reviewed-by: ohair

! test/tools/javac/OverrideChecks/6738538/T6738538a.java
! test/tools/javac/OverrideChecks/6738538/T6738538b.java
! test/tools/javac/api/6731573/Erroneous.java
! test/tools/javac/api/6731573/T6731573.java
! test/tools/javac/cast/6548436/T6548436d.java
! test/tools/javac/cast/6558559/T6558559a.java
! test/tools/javac/cast/6558559/T6558559b.java
! test/tools/javac/cast/6586091/T6586091.java
! test/tools/javac/enum/T6724345.java
! test/tools/javac/generics/T6557954.java
! test/tools/javac/generics/T6751514.java
! test/tools/javac/generics/T6869075.java
! test/tools/javac/generics/inference/6569789/T6569789.java
! test/tools/javac/generics/inference/6650759/T6650759a.java
! test/tools/javac/generics/wildcards/T6732484.java
! test/tools/javac/processing/model/util/elements/Foo.java
! test/tools/javac/varargs/T6746184.java
- test/tools/javap/T6305779.java
! test/tools/javap/T6715251.java
! test/tools/javap/T6715753.java


From martinrb at google.com  Thu Mar 11 00:48:10 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 10 Mar 2010 16:48:10 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <4B97D8B0.7020604@gmx.de>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<4B96381E.6030902@gmx.de>
	<1ccfd1c11003091118m3a0106cap88c7af89d01b6bf8@mail.gmail.com>
	<fb5ec5091003091202tf86421fr7561e8012e285a10@mail.gmail.com>
	<4B96B8BF.2020001@gmx.de>
	<1ccfd1c11003091408g4eb63e1ckd92ef6156846086b@mail.gmail.com>
	<4B96D58A.1050201@gmx.de>
	<1ccfd1c11003091522l66b7d92ndea2e2105536bb8c@mail.gmail.com>
	<4B97D8B0.7020604@gmx.de>
Message-ID: <1ccfd1c11003101648v661c69d6n270dfde650cbdaaa@mail.gmail.com>

On Wed, Mar 10, 2010 at 09:36, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 10.03.2010 00:22, Martin Buchholz wrote:
>>
>> On Tue, Mar 9, 2010 at 15:11, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>
>>
>>>
>>> Can you explain the mystery about "+ 2" ?
>>>
>>
>> It's exactly the same as the old resizing behavior.
>
> In detail I meant, if you have any idea, why the original designers could
> have chosen the "+1".
> The code would be smarter, if ommited, + would serve the algorithm-loved
> arrays of [base-2-] "round" sizes.

I bet what happened is that the +2 is necessary for an initial capacity of 0.
It turns out that the current implementation disallows this,
so it is possible to simply double the size, but I am not going to
change it now.

On the other hand, you could consider it a feature
that very small arrays should grow more rapidly than a factor of two.

Martin


From martinrb at google.com  Thu Mar 11 01:03:32 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 10 Mar 2010 17:03:32 -0800
Subject: Bugs in java.util.ArrayList, java.util.Hashtable and 
	java.io.ByteArrayOutputStream
In-Reply-To: <f5b3aa371003100158i8fbfccahbc5a5bf87420feec@mail.gmail.com>
References: <1704b7a21003031741m734545f1gb0170ed5fa6f6d68@mail.gmail.com>
	<1ccfd1c11003050104u61e776apc5fe2e5ec08e3dc0@mail.gmail.com>
	<1704b7a21003050248k1e893cedmd14f26cbecd45896@mail.gmail.com>
	<1ccfd1c11003081810u54fb22e6k25230f4eb5ca1b18@mail.gmail.com>
	<f5b3aa371003100158i8fbfccahbc5a5bf87420feec@mail.gmail.com>
Message-ID: <1ccfd1c11003101703g42708603j1cedeeae8c74b68c@mail.gmail.com>

On Wed, Mar 10, 2010 at 01:58, Goktug Gokdogan <gokdogan at gmail.com> wrote:
> Similarly,
>   BitSet.ensureCapacity

I don't think BitSet has this problem, because the bits are stored in longs,
so the array can never overflow.  But don't believe me - prove me wrong!

>   AbstractStringBuilder.expandCapacity

Yup.

>   Vector.ensureCapacityHelper

Yup.

The scope of this fix is growing...

> methods need to have similar checks and/or throw proper exceptions.
>
> By the way, I did not understand why IdentityHashMap and HashMap have
> different MAXIMUM_CAPACITY and different logic to handle resize and
> overflow.

These two classes store their data in completely different ways.
In particular, HashMap need never fail because of overflow;
Integer.MAX_VALUE buckets should be enough for anybody!

Webrev regenerated.
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/

Martin

>
> On Mon, Mar 8, 2010 at 8:10 PM, Martin Buchholz <martinrb at google.com> wrote:
>>
>> [Chris or Alan, please review and file a bug]
>>
>> OK, guys,
>>
>> Here's a patch:
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>>
>> Martin
>>
>> On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com>
>> wrote:
>> > Hi Martin,
>> >
>> > Thank you for your reply.  If I may, PriorityQueue appears to employ the
>> > simple strategy that I suggested above in its grow method:
>> >
>> >         int newCapacity = ((oldCapacity < 64)?
>> >                            ((oldCapacity + 1) * 2):
>> >                            ((oldCapacity / 2) * 3));
>> >         if (newCapacity < 0) // overflow
>> >             newCapacity = Integer.MAX_VALUE;
>> >
>> > It might be desirable to set a common strategy for capacity increase for
>> > all
>> > collections.
>> >
>> > Regards,
>> >
>> > Kevin
>> >
>> > On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com>
>> > wrote:
>> >>
>> >> Hi Kevin,
>> >>
>> >> As you've noticed, creating objects within a factor of two of
>> >> their natural limits is a good way to expose lurking bugs.
>> >>
>> >> I'm the one responsible for the algorithm in ArrayList.
>> >> I'm a bit embarrassed, looking at that code today.
>> >> We could set the array size to Integer.MAX_VALUE,
>> >> but then you might hit an independent buglet in hotspot
>> >> that you cannot allocate an array with Integer.MAX_VALUE
>> >> elements, but Integer.MAX_VALUE - 5 (or so) works.
>> >>
>> >> It occurs to me that increasing the size by 50% is better done by
>> >> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>> >>
>> >> I agree with the plan of setting the capacity to something near
>> >> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
>> >>
>> >> These bugs are not known.
>> >> Chris Hegarty, could you file a bug for us?
>> >>
>> >> Martin
>> >>
>> >> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com>
>> >> wrote:
>> >> > Greetings,
>> >> >
>> >> > I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
>> >> > java.io.ByteArrayOutputStream which arise when the capacities of the
>> >> > data
>> >> > structures reach a particular threshold.  More below.
>> >> >
>> >> > When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its
>> >> > size
>> >> > reaches its capacity and an add or an insert operation is invoked,
>> >> > the
>> >> > capacity is increased by only one element.  Notice that in the
>> >> > following
>> >> > excerpt from ArrayList.ensureCapacity the new capacity is set to
>> >> > (3/2) *
>> >> > oldCapacity + 1 unless this value would not suffice to accommodate
>> >> > the
>> >> > required capacity in which case it is set to the required capacity.
>> >> > If
>> >> > the
>> >> > current capacity is at least (2/3)*Integer.MAX_VALUE, then
>> >> > (oldCapacity
>> >> > *
>> >> > 3)/2 + 1 overflows and resolves to a negative number resulting in the
>> >> > new
>> >> > capacity being set to the required capacity.  The major consequence
>> >> > of
>> >> > this
>> >> > is that each subsequent add/insert operation results in a full resize
>> >> > of
>> >> > the
>> >> > ArrayList causing performance to degrade significantly.
>> >> >
>> >> >         int newCapacity = (oldCapacity * 3)/2 + 1;
>> >> >         if (newCapacity < minCapacity)
>> >> >             newCapacity = minCapacity;
>> >> >
>> >> > Hashtable breaks entirely when the size of its backing array reaches
>> >> > (1/2) *
>> >> > Integer.MAX_VALUE and a rehash is necessary as is evident from the
>> >> > following
>> >> > excerpt from rehash.  Notice that rehash will attempt to create an
>> >> > array
>> >> > of
>> >> > negative size if the size of the backing array reaches (1/2) *
>> >> > Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to
>> >> > a
>> >> > negative number.
>> >> >
>> >> >     int newCapacity = oldCapacity * 2 + 1;
>> >> >     HashtableEntry newTable[] = new HashtableEntry[newCapacity];
>> >> >
>> >> > When the capacity of the backing array in a ByteArrayOutputStream
>> >> > reaches
>> >> > (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
>> >> > operation is invoked, the capacity of the backing array is increased
>> >> > only by
>> >> > the required number of elements.  Notice that in the following
>> >> > excerpt
>> >> > from
>> >> > ByteArrayOutputStream.write(int) the new backing array capacity is
>> >> > set
>> >> > to 2
>> >> > * buf.length unless this value would not suffice to accommodate the
>> >> > required
>> >> > capacity in which case it is set to the required capacity.  If the
>> >> > current
>> >> > backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1,
>> >> > then
>> >> > buf.length << 1 overflows and resolves to a negative number resulting
>> >> > in
>> >> > the
>> >> > new capacity being set to the required capacity.  The major
>> >> > consequence
>> >> > of
>> >> > this, like with ArrayList, is that each subsequent write operation
>> >> > results
>> >> > in a full resize of the ByteArrayOutputStream causing performance to
>> >> > degrade
>> >> > significantly.
>> >> >
>> >> >     int newcount = count + 1;
>> >> >     if (newcount > buf.length) {
>> >> >         buf = Arrays.copyOf(buf, Math.max(buf.length << 1,
>> >> > newcount));
>> >> >     }
>> >> >
>> >> > It is interesting to note that any statements about the amortized
>> >> > time
>> >> > complexity of add/insert operations, such as the one in the ArrayList
>> >> > javadoc, are invalidated by the performance related bugs.  One
>> >> > solution
>> >> > to
>> >> > the above situations is to set the new capacity of the backing array
>> >> > to
>> >> > Integer.MAX_VALUE when the initial size calculation results in a
>> >> > negative
>> >> > number during a resize.
>> >> >
>> >> > Apologies if these bugs are already known.
>> >> >
>> >> > Regards,
>> >> >
>> >> > Kevin
>> >> >
>> >
>> >
>
>


From martinrb at google.com  Thu Mar 11 01:59:31 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 10 Mar 2010 17:59:31 -0800
Subject: Progress of patches
In-Reply-To: <4B97DDB3.2080107@gmx.de>
References: <4B96E323.80505@gmx.de>
	<1ccfd1c11003091710u1e2ec8cevf64110ee3af2d88b@mail.gmail.com>
	<4B96F476.1060409@gmx.de> <4B97DDB3.2080107@gmx.de>
Message-ID: <1ccfd1c11003101759g5f28ec2dhfd1a220ed6758880@mail.gmail.com>

On Wed, Mar 10, 2010 at 09:58, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Hi Martin,
>
> there wasn't enough time today, so please wait for tomorrow.
>
> In brief:
> - I wouldn't rename to isBMPCodePoint(), because there are many other names
> in Surrogate class that don't sync to Character and and a usages search in
> sun.nio.cs.* or where ever else could be omitted. Better add "//  return
> Character.isBMPCodePoint(uc);" as hint for the future.
> - Thanks for mention me as contributor.
> - Doesn't the bug description include the addition of isBMPCodePoint() to
> class Character and the equivalent enhancement to isSupplementaryCodePoint()
> ?

Sorry, I should have included the fix to isSupplementaryCodePoint()
in the last fix.

Here's the next fix:

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/

6666666: A better implementation of Character.isSupplementaryCodePoint
Summary: Clever bit-twiddling saves a few bytes of machine code
Reviewed-by: sherman
Contributed-by: Ulf Zibis <Ulf.Zibis at gmx.de>
diff --git a/src/share/classes/java/lang/Character.java
b/src/share/classes/java/lang/Character.java
--- a/src/share/classes/java/lang/Character.java
+++ b/src/share/classes/java/lang/Character.java
@@ -2693,8 +2693,8 @@
      * @since  1.5
      */
     public static boolean isSupplementaryCodePoint(int codePoint) {
-        return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
-            && codePoint <= MAX_CODE_POINT;
+        int plane = codePoint >>> 16;
+        return plane != 0 && plane < ((MAX_CODE_POINT + 1) >>> 16);
     }

     /**
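
A quick way to convince oneself that the patched expression agrees with the original is to evaluate both at the plane boundaries. The sketch below copies the two expressions from the diff into hypothetical helper methods (oldImpl and newImpl are illustrative names); Character.MIN_SUPPLEMENTARY_CODE_POINT and Character.MAX_CODE_POINT are the public JDK constants.

```java
// Illustrative boundary check for the two isSupplementaryCodePoint variants.
final class SupplementaryCheck {
    // Expression removed by the diff.
    static boolean oldImpl(int codePoint) {
        return codePoint >= Character.MIN_SUPPLEMENTARY_CODE_POINT
            && codePoint <= Character.MAX_CODE_POINT;
    }

    // Expression added by the diff: plane 0 is the BMP, planes 1..16 are supplementary.
    static boolean newImpl(int codePoint) {
        int plane = codePoint >>> 16;
        return plane != 0 && plane < ((Character.MAX_CODE_POINT + 1) >>> 16);
    }

    public static void main(String[] args) {
        int[] boundaries = { -1, 0, 0xFFFF, 0x10000, 0x10FFFF, 0x110000 };
        for (int cp : boundaries) {
            System.out.println(Integer.toHexString(cp) + ": old=" + oldImpl(cp) + " new=" + newImpl(cp));
        }
    }
}
```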


From martinrb at google.com  Thu Mar 11 04:42:09 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 10 Mar 2010 20:42:09 -0800
Subject: Progress of patches
In-Reply-To: <1ccfd1c11003101759g5f28ec2dhfd1a220ed6758880@mail.gmail.com>
References: <4B96E323.80505@gmx.de>
	<1ccfd1c11003091710u1e2ec8cevf64110ee3af2d88b@mail.gmail.com>
	<4B96F476.1060409@gmx.de> <4B97DDB3.2080107@gmx.de>
	<1ccfd1c11003101759g5f28ec2dhfd1a220ed6758880@mail.gmail.com>
Message-ID: <1ccfd1c11003102042n192885b1ld3859b0f5311e732@mail.gmail.com>

I couldn't resist making a similar change to isValidCodePoint.

@@ -2678,7 +2678,8 @@
      * @since  1.5
      */
     public static boolean isValidCodePoint(int codePoint) {
-        return codePoint >= MIN_CODE_POINT && codePoint <= MAX_CODE_POINT;
+        int plane = codePoint >>> 16;
+        return plane < ((MAX_CODE_POINT + 1) >>> 16);
     }

     /**

This is a more important optimization, since isValidCodePoint
almost always requires two compares, and this reduces it to one.
(Still, none of these are really important, and no one will notice)
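
The reason a single compare suffices here is that the unsigned shift >>> maps every negative int to a plane value of at least 0x8000, so the lower-bound check against MIN_CODE_POINT (0) is subsumed by the one upper-bound compare. A minimal sketch with hypothetical helper names that checks this over a strided sweep of the whole int range:

```java
// Illustrative check that one unsigned-shift compare matches the two-compare form.
final class ValidCodePointCheck {
    // Expression removed by the diff.
    static boolean twoCompares(int codePoint) {
        return codePoint >= Character.MIN_CODE_POINT
            && codePoint <= Character.MAX_CODE_POINT;
    }

    // Expression added by the diff.
    static boolean oneCompare(int codePoint) {
        int plane = codePoint >>> 16;
        return plane < ((Character.MAX_CODE_POINT + 1) >>> 16);
    }

    // Walks the int range with the given stride and counts disagreements.
    static long countMismatches(long step) {
        long mismatches = 0;
        for (long v = Integer.MIN_VALUE; v <= Integer.MAX_VALUE; v += step) {
            int cp = (int) v;
            if (twoCompares(cp) != oneCompare(cp)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("plane of -1: 0x" + Integer.toHexString(-1 >>> 16));
        System.out.println("mismatches: " + countMismatches(4099));
    }
}
```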

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/

Martin

On Wed, Mar 10, 2010 at 17:59, Martin Buchholz <martinrb at google.com> wrote:
> On Wed, Mar 10, 2010 at 09:58, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>> Hi Martin,
>>
>> there wasn't enough time today, so please wait for tomorrow.
>>
>> In brief:
>> - I wouldn't rename to isBMPCodePoint(), because there are many other names
>> in Surrogate class that don't sync to Character and and a usages search in
>> sun.nio.cs.* or where ever else could be omitted. Better add "//  return
>> Character.isBMPCodePoint(uc);" as hint for the future.
>> - Thanks for mention me as contributor.
>> - Doesn't the bug description include the addition of isBMPCodePoint() to
>> class Character and the equivalent enhancement to isSupplementaryCodePoint()
>> ?
>
> Sorry, I should have included the fix to isSupplementaryCodePoint()
> in the last fix.
>
> Here's the next fix:
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/
>
> 6666666: A better implementation of Character.isSupplementaryCodePoint
> Summary: Clever bit-twiddling saves a few bytes of machine code
> Reviewed-by: sherman
> Contributed-by: Ulf Zibis <Ulf.Zibis at gmx.de>
> diff --git a/src/share/classes/java/lang/Character.java
> b/src/share/classes/java/lang/Character.java
> --- a/src/share/classes/java/lang/Character.java
> +++ b/src/share/classes/java/lang/Character.java
> @@ -2693,8 +2693,8 @@
>       * @since  1.5
>       */
>      public static boolean isSupplementaryCodePoint(int codePoint) {
> -        return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
> -            && codePoint <= MAX_CODE_POINT;
> +        int plane = codePoint >>> 16;
> +        return plane != 0 && plane < ((MAX_CODE_POINT + 1) >>> 16);
>      }
>
>      /**
>


From Xueming.Shen at Sun.COM  Thu Mar 11 06:47:44 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Wed, 10 Mar 2010 22:47:44 -0800
Subject: Codereview needed for #6929479
In-Reply-To: <4B86D763.8080603@sun.com>
References: <4B86CAE8.3080008@sun.com> <4B86D43D.4000002@sun.com>
	<4B86D6C4.4080600@sun.com> <4B86D763.8080603@sun.com>
Message-ID: <4B989210.4020303@sun.com>

Alan,

webrev has been updated to use the sun.zip.disableMemoryMapping

http://cr.openjdk.java.net/~sherman/6929479/webrev

Please review.

Thanks,
Sherman

Alan Bateman wrote:
> Xueming Shen wrote:
>> :
>> The webrev has been updated to use "sun.zip.disableMmapping", I guess 
>> you meant "sun.zip.disableMmapping", right?
> It's hard to find a good name. I was suggesting disableMapping (no 
> double m) but disableMemoryMapping could work too.
>
> -Alan.


From Alan.Bateman at Sun.COM  Thu Mar 11 15:34:43 2010
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Thu, 11 Mar 2010 15:34:43 +0000
Subject: Codereview needed for #6929479
In-Reply-To: <4B989210.4020303@sun.com>
References: <4B86CAE8.3080008@sun.com> <4B86D43D.4000002@sun.com>
	<4B86D6C4.4080600@sun.com> <4B86D763.8080603@sun.com>
	<4B989210.4020303@sun.com>
Message-ID: <4B990D93.9050806@sun.com>

Xueming Shen wrote:
> Alan,
>
> webrev has been updated to use the sun.zip.disableMemoryMapping
>
> http://cr.openjdk.java.net/~sherman/6929479/webrev
>
> Please review.
>
> Thanks,
> Sherman
I agree it's a better name.  In ZipFile it would be good to put a
comment at the initialization so that the reader understands what this
property is about. A minor nit in zip_util.c at L805: the indenting
looks to be out by one.  In any case, this will be a useful
debugging option for the next time that someone steps on their own feet.

-Alan


From christopher.hegarty at sun.com  Thu Mar 11 16:20:14 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Thu, 11 Mar 2010 16:20:14 +0000
Subject: hg: jdk7/tl/jdk: 6934054: java/net/Socket/FDClose.java return error
	in samevm
Message-ID: <20100311162054.0273244735@hg.openjdk.java.net>

Changeset: 07e1c5a90c6a
Author:    chegar
Date:      2010-03-11 16:17 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/07e1c5a90c6a

6934054: java/net/Socket/FDClose.java return error in samevm
Summary: test is no longer useful
Reviewed-by: alanb

! test/ProblemList.txt
- test/java/net/Socket/FDClose.java


From christopher.hegarty at sun.com  Thu Mar 11 17:39:36 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Thu, 11 Mar 2010 17:39:36 +0000
Subject: hg: jdk7/tl/jdk: 6933629:
	java/net/HttpURLConnection/HttpResponseCode.java fails if run
	in samevm mode
Message-ID: <20100311173955.C60B444747@hg.openjdk.java.net>

Changeset: c342735a3e58
Author:    chegar
Date:      2010-03-11 17:37 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/c342735a3e58

6933629: java/net/HttpURLConnection/HttpResponseCode.java fails if run in samevm mode
Reviewed-by: alanb

! test/ProblemList.txt
! test/java/net/CookieHandler/CookieHandlerTest.java


From christopher.hegarty at sun.com  Thu Mar 11 17:51:09 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Thu, 11 Mar 2010 17:51:09 +0000
Subject: hg: jdk7/tl/jdk: 6223635: Code hangs at connect call even when
	Timeout is specified when using a socks proxy
Message-ID: <20100311175127.DA7B84474C@hg.openjdk.java.net>

Changeset: c6f8c58ed51a
Author:    chegar
Date:      2010-03-11 17:50 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/c6f8c58ed51a

6223635: Code hangs at connect call even when Timeout is specified when using a socks proxy
Reviewed-by: michaelm, chegar
Contributed-by: damjan.jov at gmail.com

! src/share/classes/java/net/SocketInputStream.java
! src/share/classes/java/net/SocksSocketImpl.java
+ test/java/net/Socket/SocksConnectTimeout.java


From Ulf.Zibis at gmx.de  Thu Mar 11 17:53:43 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 11 Mar 2010 18:53:43 +0100
Subject: Progress of patches
In-Reply-To: <1ccfd1c11003102042n192885b1ld3859b0f5311e732@mail.gmail.com>
References: <4B96E323.80505@gmx.de>	
	<1ccfd1c11003091710u1e2ec8cevf64110ee3af2d88b@mail.gmail.com>	
	<4B96F476.1060409@gmx.de> <4B97DDB3.2080107@gmx.de>	
	<1ccfd1c11003101759g5f28ec2dhfd1a220ed6758880@mail.gmail.com>
	<1ccfd1c11003102042n192885b1ld3859b0f5311e732@mail.gmail.com>
Message-ID: <4B992E27.9000205@gmx.de>

I couldn't resist either ;-). See:
https://bugs.openjdk.java.net/attachment.cgi?id=178&action=diff
Download:
https://bugs.openjdk.java.net/attachment.cgi?id=178

Please keep in mind:
- the performance advantage as a pair only occurs if isBMPCodePoint also
uses the logical shift '>>>'.
- String(int[] codePoints, int offset, int count) would have to access
sun.nio.cs.Surrogate if isBMPCodePoint doesn't exist in Character.

Hopefully you can agree with all my changes,

-Ulf


On 11.03.2010 05:42, Martin Buchholz wrote:
> I couldn't resist making a similar change to isValidCodePoint.
>
> @@ -2678,7 +2678,8 @@
>        * @since  1.5
>        */
>       public static boolean isValidCodePoint(int codePoint) {
> -        return codePoint >= MIN_CODE_POINT && codePoint <= MAX_CODE_POINT;
> +        int plane = codePoint >>> 16;
> +        return plane < ((MAX_CODE_POINT + 1) >>> 16);
>       }
>
>       /**
>
> This is a more important optimization, since isValidCodePoint
> almost always requires two compares, and this reduces it to one.
> (Still, none of these are really important, and no one will notice)
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/
>
> Martin
>
> On Wed, Mar 10, 2010 at 17:59, Martin Buchholz<martinrb at google.com>  wrote:
>    
>> On Wed, Mar 10, 2010 at 09:58, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>      
>>> Hi Martin,
>>>
>>> there wasn't enough time today, so please wait for tomorrow.
>>>
>>> In brief:
>>> - I wouldn't rename to isBMPCodePoint(), because there are many other names
>>> in Surrogate class that don't sync to Character and and a usages search in
>>> sun.nio.cs.* or where ever else could be omitted. Better add "//  return
>>> Character.isBMPCodePoint(uc);" as hint for the future.
>>> - Thanks for mention me as contributor.
>>> - Doesn't the bug description include the addition of isBMPCodePoint() to
>>> class Character and the equivalent enhancement to isSupplementaryCodePoint()
>>> ?
>>>        
>> Sorry, I should have included the fix to isSupplementaryCodePoint()
>> in the last fix.
>>
>> Here's the next fix:
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/
>>
>> 6666666: A better implementation of Character.isSupplementaryCodePoint
>> Summary: Clever bit-twiddling saves a few bytes of machine code
>> Reviewed-by: sherman
>> Contributed-by: Ulf Zibis<Ulf.Zibis at gmx.de>
>> diff --git a/src/share/classes/java/lang/Character.java
>> b/src/share/classes/java/lang/Character.java
>> --- a/src/share/classes/java/lang/Character.java
>> +++ b/src/share/classes/java/lang/Character.java
>> @@ -2693,8 +2693,8 @@
>>       * @since  1.5
>>       */
>>      public static boolean isSupplementaryCodePoint(int codePoint) {
>> -        return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>> -            && codePoint <= MAX_CODE_POINT;
>> +        int plane = codePoint >>> 16;
>> +        return plane != 0 && plane < ((MAX_CODE_POINT + 1) >>> 16);
>>      }
>>
>>      /**
>>
>>      
>
>    


From Ulf.Zibis at gmx.de  Thu Mar 11 18:25:01 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 11 Mar 2010 19:25:01 +0100
Subject: Progress of patches
In-Reply-To: <1ccfd1c11003102042n192885b1ld3859b0f5311e732@mail.gmail.com>
References: <4B96E323.80505@gmx.de>	
	<1ccfd1c11003091710u1e2ec8cevf64110ee3af2d88b@mail.gmail.com>	
	<4B96F476.1060409@gmx.de> <4B97DDB3.2080107@gmx.de>	
	<1ccfd1c11003101759g5f28ec2dhfd1a220ed6758880@mail.gmail.com>
	<1ccfd1c11003102042n192885b1ld3859b0f5311e732@mail.gmail.com>
Message-ID: <4B99357D.5090501@gmx.de>

Am 11.03.2010 05:42, schrieb Martin Buchholz:
> I couldn't resist making a similar change to isValidCodePoint.
>
> @@ -2678,7 +2678,8 @@
>        * @since  1.5
>        */
>       public static boolean isValidCodePoint(int codePoint) {
> -        return codePoint >= MIN_CODE_POINT && codePoint <= MAX_CODE_POINT;
> +        int plane = codePoint >>> 16;
> +        return plane < ((MAX_CODE_POINT + 1) >>> 16);
>       }
>
>       /**
>
> This is a more important optimization, since isValidCodePoint
> almost always requires two compares, and this reduces it to one.
>    

Why isn't this true for isSupplementaryCodePoint() too?
There in particular, the "cheap" compare against 0 is not available.

> (Still, none of these are really important, and no one will notice)
>    

Maybe they will, in String(int[] codePoints, int offset, int count) or in the
numerous sun.nio.cs charset coders, which call these methods consecutively in a
loop on each Unicode character.

-Ulf


From Ulf.Zibis at gmx.de  Thu Mar 11 18:32:30 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 11 Mar 2010 19:32:30 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <4B97E3BD.2000901@sun.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com>
Message-ID: <4B99373E.40502@gmx.de>

Sherman,

I know your time is limited, but maybe someone is needed as sponsor here:
https://bugs.openjdk.java.net/show_bug.cgi?id=100132

Could you do this?

Many thanks,

-Ulf


Am 10.03.2010 19:23, schrieb Xueming Shen:
> approved.
>
> I don't have a spare ws right now, so please just push; it's almost
> there :-)
>
> sherman
>
> Martin Buchholz wrote:
>> Here's the proposed fix for
>> 6931812: A better implementation of sun.nio.cs.Surrogate.isBMP(int)
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint/
>>
>> I changed the name to isBMPCodePoint in preparation for moving
>> it to Character.java.
>> (Sherman, perhaps you would like to take on that followon task?)
>>
>> Sherman, please approve.
>>
>> Martin
>>
>> On Sat, Mar 6, 2010 at 13:00, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>> Very fast Sherman, much thanks.
>>>
>>> Could you set the bug to accepted and evaluated, so my patch will 
>>> have a
>>> chance to get into the code base?
>>>
>>> -Ulf
>>>
>>>
>>> Am 03.03.2010 20:11, schrieb Xueming Shen:
>>>> #6931812
>>>>
>>>> Martin Buchholz wrote:
>>>>> Sherman, would you like to file bugs for Ulf's improvements?
>>>>>
>>>>> On Wed, Mar 3, 2010 at 02:44, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>>>> Am 03.03.2010 09:00, schrieb Martin Buchholz:
>>>>>>> Keep in mind that supplementary characters are extremely rare.
>>>>>>>
>>>>>> Yes, but many APIs in the JDK are rarely used.
>>>>>> Why should they waste memory footprint or perform badly,
>>>>>> particularly if fixing that doesn't cost anything?
>>>>> I admire your perfectionism.
>>>>>
>>>>>>> Therefore the existing implementation
>>>>>>>
>>>>>>>  return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>>>      && codePoint <= MAX_CODE_POINT;
>>>>>>>
>>>>>>> will almost always perform just one comparison against a constant,
>>>>>>> which is hard to beat.
>>>>>>>
>>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>>> 2. Those comparisons need to load 32-bit values from the machine code,
>>>>>> against only 8-bit values in my case.
>>>>> It's a good point.  In the machine code, shifts are likely to use
>>>>> immediate values, and so will be a small win.
>>>>>
>>>>> int x = codePoint >>> 16;
>>>>> return x != 0 && x < 0x11;
>>>>>
>>>>> (On modern hardware, these optimizations
>>>>> are less valuable than they used to be;
>>>>> ordinary integer arithmetic is almost free)
>>>>>
>>>>> Martin
>
>


From martinrb at google.com  Thu Mar 11 19:38:54 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 11 Mar 2010 11:38:54 -0800
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4B99373E.40502@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
Message-ID: <1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>

Ulf, your changes would be easier to get in
if they were organized as mq patch files that
could be qimported into an existing mq repo.

I've done that below, which includes a subset of
your own proposed changes:

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint/
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings/
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/malformed-utf8/

Sherman (or Alan),

please review and/or file bugs for the above changes.

isBMPCodePoint is a spec addition, requiring additional paperwork.

Sherman, you owe me a response to my now-moldy proposed changes to
the UTF-8 charset.

The only controversial change would be the change in behavior in
malformed-utf8, which I can take out.

Martin



From martinrb at google.com  Thu Mar 11 19:56:05 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 11 Mar 2010 11:56:05 -0800
Subject: Progress of patches
In-Reply-To: <4B99357D.5090501@gmx.de>
References: <4B96E323.80505@gmx.de>
	<1ccfd1c11003091710u1e2ec8cevf64110ee3af2d88b@mail.gmail.com>
	<4B96F476.1060409@gmx.de> <4B97DDB3.2080107@gmx.de>
	<1ccfd1c11003101759g5f28ec2dhfd1a220ed6758880@mail.gmail.com>
	<1ccfd1c11003102042n192885b1ld3859b0f5311e732@mail.gmail.com>
	<4B99357D.5090501@gmx.de>
Message-ID: <1ccfd1c11003111156u40854e62ybba922383d636cb9@mail.gmail.com>

On Thu, Mar 11, 2010 at 10:25, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 11.03.2010 05:42, schrieb Martin Buchholz:
>>
>> I couldn't resist making a similar change to isValidCodePoint.
>>
>> @@ -2678,7 +2678,8 @@
>>       * @since  1.5
>>       */
>>      public static boolean isValidCodePoint(int codePoint) {
>> -        return codePoint >= MIN_CODE_POINT && codePoint <= MAX_CODE_POINT;
>> +        int plane = codePoint >>> 16;
>> +        return plane < ((MAX_CODE_POINT + 1) >>> 16);
>>      }
>>
>>      /**
>>
>> This is a more important optimization, since isValidCodePoint
>> almost always requires two compares, and this reduces it to one.
>>
>
> Why isn't this true for isSupplementaryCodePoint() too?
> There in particular, the "cheap" compare against 0 is not available.

Because almost all code points are actually BMP,
the naive implementation of isValidCodePoint will
almost always require one more branch than
isSupplementaryCodePoint,
making it more valuable to optimize.
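
To make the branch-count argument concrete, here is a minimal Java sketch that
puts the two-compare forms next to the shifted forms from the quoted diffs and
checks that they agree. The class name, the helper names and the sample values
are only illustrative assumptions; the method bodies follow the diffs quoted in
this thread. For a BMP code point, the naive isValidCodePoint passes its first
test and must run the second one too, while the naive isSupplementaryCodePoint
fails its first test and short-circuits.

```java
final class CodePointRangeSketch {
    // One past the last valid plane: (0x10FFFF + 1) >>> 16 == 0x11.
    static final int PLANE_LIMIT = (Character.MAX_CODE_POINT + 1) >>> 16;

    /** Two-sided range check, as in the line removed by the quoted diff. */
    static boolean isValidNaive(int codePoint) {
        return codePoint >= Character.MIN_CODE_POINT
            && codePoint <= Character.MAX_CODE_POINT;
    }

    /** Shifted form: a single compare on the plane number. */
    static boolean isValidShifted(int codePoint) {
        int plane = codePoint >>> 16;   // negative inputs give a plane >= 0x8000
        return plane < PLANE_LIMIT;
    }

    /** Two-sided range check; short-circuits after one compare for BMP input. */
    static boolean isSupplementaryNaive(int codePoint) {
        return codePoint >= Character.MIN_SUPPLEMENTARY_CODE_POINT
            && codePoint <= Character.MAX_CODE_POINT;
    }

    /** Shifted form: plane must be non-zero and below the limit. */
    static boolean isSupplementaryShifted(int codePoint) {
        int plane = codePoint >>> 16;
        return plane != 0 && plane < PLANE_LIMIT;
    }

    /** True if the naive and shifted forms of both predicates agree. */
    static boolean agree(int codePoint) {
        return isValidNaive(codePoint) == isValidShifted(codePoint)
            && isSupplementaryNaive(codePoint) == isSupplementaryShifted(codePoint);
    }

    public static void main(String[] args) {
        int[] samples = {
            Integer.MIN_VALUE, -1, 0, 'A', 0xFFFF,
            0x10000, 0x10FFFF, 0x110000, Integer.MAX_VALUE
        };
        for (int cp : samples) {
            System.out.printf("%08X valid=%b supplementary=%b agree=%b%n",
                cp, isValidShifted(cp), isSupplementaryShifted(cp), agree(cp));
        }
    }
}
```

The sketch only demonstrates equivalence; whether the shifted form is actually
faster depends on what the JIT emits, which it does not measure.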

>> (Still, none of these are really important, and no one will notice)

Martin


From Ulf.Zibis at gmx.de  Thu Mar 11 20:19:33 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 11 Mar 2010 21:19:33 +0100
Subject: Progress of patches
In-Reply-To: <1ccfd1c11003111156u40854e62ybba922383d636cb9@mail.gmail.com>
References: <4B96E323.80505@gmx.de>	
	<1ccfd1c11003091710u1e2ec8cevf64110ee3af2d88b@mail.gmail.com>	
	<4B96F476.1060409@gmx.de> <4B97DDB3.2080107@gmx.de>	
	<1ccfd1c11003101759g5f28ec2dhfd1a220ed6758880@mail.gmail.com>	
	<1ccfd1c11003102042n192885b1ld3859b0f5311e732@mail.gmail.com>	
	<4B99357D.5090501@gmx.de>
	<1ccfd1c11003111156u40854e62ybba922383d636cb9@mail.gmail.com>
Message-ID: <4B995055.60700@gmx.de>

Am 11.03.2010 20:56, schrieb Martin Buchholz:
>
>> Why isn't this true for isSupplementaryCodePoint() too?
>> There in particular, the "cheap" compare against 0 is not available.
>>      
> Because almost all code points are actually BMP,
> the naive implementation of isValidCodePoint will
> almost always require one more branch than
> isSupplementaryCodePoint,
> making it more valuable to optimize.

Thanks, now it's clear what you meant. It's my disadvantage that I'm not so
familiar with English here in Germany.

-Ulf


From Ulf.Zibis at gmx.de  Thu Mar 11 21:14:10 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 11 Mar 2010 22:14:10 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8DA070.3040306@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
Message-ID: <4B995D22.2020507@gmx.de>

Am 11.03.2010 20:38, schrieb Martin Buchholz:
> Ulf, your changes would be easier to get in
> if they were organized as mq patch files that
> could be qimported into an existing mq repo.
>

To be honest, I have never heard of mq. Can you point me to some docs, please?

> I've done that below, which includes a subset of
> your own proposed changes:
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint/
>

- Maybe better:  "... using a single {@code char}".
- Why don't you like using the new isBMPCodePoint() in isSupplementaryCodePoint() and
toUpperCaseCharArray()?
- The same shift magic would enhance isISOControl(), isHighSurrogate() and isLowSurrogate(), in
particular if the latter are called consecutively.
   An 8-bit shift plus compare would allow HotSpot to compile to compact 1-byte immediate opcodes.
- Don't you think my notes on validity are worth adding? (Or should they go in a separate bug?)
- Changing ch <= MAX_SURROGATE to ch < MAX_SURROGATE + 1 would allow the HotSpot compiler to
optimize away one branch if those methods are used consecutively.
- And lastly, I would like to complete the constants (i.e. add MAX_SUPPLEMENTARY_CODE_POINT).
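
As one possible reading of the "shift magic" for the surrogate tests, here is a
small Java sketch. It is an assumption on my side, not the patch under review:
it uses a 10-bit shift (not the 8-bit shift mentioned above), because each
surrogate half covers exactly 1024 values, so one shift and one equality test
suffice. The class and method names are hypothetical; Character.isHighSurrogate
and Character.isLowSurrogate serve as the reference.

```java
final class SurrogateShiftSketch {
    /** High surrogates are 0xD800..0xDBFF, i.e. all chars whose top 6 bits are 0x36. */
    static boolean isHighSurrogateShifted(char ch) {
        return (ch >>> 10) == (Character.MIN_HIGH_SURROGATE >>> 10);
    }

    /** Low surrogates are 0xDC00..0xDFFF, i.e. all chars whose top 6 bits are 0x37. */
    static boolean isLowSurrogateShifted(char ch) {
        return (ch >>> 10) == (Character.MIN_LOW_SURROGATE >>> 10);
    }

    /** Counts chars on which a shifted test disagrees with java.lang.Character. */
    static int countMismatches() {
        int mismatches = 0;
        for (int c = 0; c <= 0xFFFF; c++) {
            char ch = (char) c;
            if (isHighSurrogateShifted(ch) != Character.isHighSurrogate(ch)) mismatches++;
            if (isLowSurrogateShifted(ch) != Character.isLowSurrogate(ch)) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches over all chars: " + countMismatches());
    }
}
```

Whether HotSpot turns this into shorter machine code than the two-compare form
would have to be checked in the generated assembly.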

> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings/
>

That reminds me that some months ago I prepared a tidied-up version of Character's source (things
like the above, replacing <code> with {@code}, fixing indentation inconsistencies, etc.). Would
there be interest in such a patch?

> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/malformed-utf8/
>

In encodeBufferLoop() you could use putChar(), putInt() instead of put(). That should perform better.

> Sherman (or Alan),
>
> please review and/or file bugs for the above changes.
>
> isBMPCodePoint is a spec addition, requiring additional paperwork.
>
> Sherman, you owe me a response to my now-moldy proposed changes to
> the UTF-8 charset.
>
> The only controversial change would be the change in behavior in
> malformed-utf8, which I can take out.
>

This reminds me of some thoughts. To be *exact*, I think malformed should be returned for all
codes that are invalid in the character set concerned. So first check for unmappable and then
for invalid (= malformed). This costs no performance while looping over mappable and valid
characters, only a little more effort after the loop is interrupted, to form the right CoderResult.


-Ulf


From Xueming.Shen at Sun.COM  Thu Mar 11 21:24:44 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Thu, 11 Mar 2010 13:24:44 -0800
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
Message-ID: <4B995F9C.3070705@sun.com>

Martin, Ulf

The following bugs/RFEs have been filed:

6934265: Add public method Character.isBMPCodePoint
6934268: Better implementation of Character.isValidCodePoint and isSupplementaryCodePoint()
6934270: Remove javac warnings from Character.java
6934271: Better handling of longer utf-8 sequences

Masayoshi, Alan, would you please help review the corresponding CCC for
6934265 at
http://ccc.sfbay.sun.com/6934265

Martin, please don't touch the UTF-8 malformed issue for now; an
incompatible change in UTF-8 is an issue.

sherman



From Xueming.Shen at Sun.COM  Thu Mar 11 21:45:37 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Thu, 11 Mar 2010 13:45:37 -0800
Subject: Codereview needed for #6929479
In-Reply-To: <4B990D93.9050806@sun.com>
References: <4B86CAE8.3080008@sun.com> <4B86D43D.4000002@sun.com>
	<4B86D6C4.4080600@sun.com> <4B86D763.8080603@sun.com>
	<4B989210.4020303@sun.com> <4B990D93.9050806@sun.com>
Message-ID: <4B996481.7000102@sun.com>

Alan Bateman wrote:
> Xueming Shen wrote:
>> Alan,
>>
>> webrev has been updated to use the sun.zip.disableMemoryMapping
>>
>> http://cr.openjdk.java.net/~sherman/6929479/webrev
>>
>> Please review.
>>
>> Thanks,
>> Sherman
> I agree it's a better name.  In ZipFile it would be good to put a
> comment at the initialization so that the reader understands what this
> property is about. A minor nit in zip_util.c at L805 is that it looks
> like the indenting is out by one.  In any case, this will be a useful
> debugging option for the next time that someone steps on their own feet.
>
> -Alan
>
Thanks Alan.

A comment has been added to ZipFile.
I counted the spaces in zip_util.c/L805 :-) and the indenting appears to be
correct. Maybe it is because of a font setting?

Sherman


From xueming.shen at sun.com  Thu Mar 11 22:13:54 2010
From: xueming.shen at sun.com (xueming.shen at sun.com)
Date: Thu, 11 Mar 2010 22:13:54 +0000
Subject: hg: jdk7/tl/jdk: 6929479: Add a system property
	sun.zip.disableMemoryMapping to disable mmap use in ZipFile
Message-ID: <20100311221413.1ACC14478B@hg.openjdk.java.net>

Changeset: ee385b4e2ffb
Author:    sherman
Date:      2010-03-11 14:06 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/ee385b4e2ffb

6929479: Add a system property sun.zip.disableMemoryMapping to disable mmap use in ZipFile
Summary: system property sun.zip.disableMemoryMapping to disable mmap use
Reviewed-by: alanb

! src/share/classes/java/util/zip/ZipFile.java
! src/share/native/java/util/zip/ZipFile.c
! src/share/native/java/util/zip/zip_util.c
! src/share/native/java/util/zip/zip_util.h


From martinrb at google.com  Fri Mar 12 01:46:37 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 11 Mar 2010 17:46:37 -0800
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4B995D22.2020507@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995D22.2020507@gmx.de>
Message-ID: <1ccfd1c11003111746o5dcd81d1veadb0c6a4882df65@mail.gmail.com>

On Thu, Mar 11, 2010 at 13:14, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 11.03.2010 20:38, schrieb Martin Buchholz:
>>
>> Ulf, your changes would be easier to get in
>> if they were organized as mq patch files that
>> could be qimported into an existing mq repo.
>>
>
> To be honest, I have never heard of mq. Can you point me to some docs, please?

http://mercurial.selenic.com/wiki/MqExtension
http://hgbook.red-bean.com/read/managing-change-with-mercurial-queues.html

>> I've done that below, which includes a subset of
>> your own proposed changes:
>>
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint/
>>
>
> - Maybe better:  "... using a single {@code char}".
> - Why don't you like using the new isBMPCodePoint() for
> isSupplementaryCodePoint() and toUpperCaseCharArray() ?
> - Same shift magic would enhance isISOControl(),

I propose the following small improvement on your own
version of isISOControl:

    public static boolean isISOControl(int codePoint) {
        // Optimized form of:
        //     (codePoint >= 0x0000 && codePoint <= 0x001F) ||
        //     (codePoint >= 0x007F && codePoint <= 0x009F);
        return codePoint <= 0x009F &&
            (codePoint >= 0x007F || (codePoint >>> 5 == 0));
    }

Because non-ASCII chars get away with only one comparison.
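
A quick way to convince oneself that the rewritten isISOControl is equivalent
is to compare it with the two-range form over a window of inputs. The Java
sketch below does that; the class name, helper names and the tested window are
illustrative assumptions, while the two method bodies are taken from the code
and its comment above.

```java
final class IsoControlSketch {
    /** The two-range form from the comment in the proposal. */
    static boolean isISOControlNaive(int codePoint) {
        return (codePoint >= 0x0000 && codePoint <= 0x001F)
            || (codePoint >= 0x007F && codePoint <= 0x009F);
    }

    /** The proposed form: anything above 0x9F is rejected by the first compare. */
    static boolean isISOControlOptimized(int codePoint) {
        return codePoint <= 0x009F
            && (codePoint >= 0x007F || (codePoint >>> 5 == 0));
    }

    /** Counts disagreements on [from, to]; 'to' must be below Integer.MAX_VALUE. */
    static int countMismatches(int from, int to) {
        int mismatches = 0;
        for (int cp = from; cp <= to; cp++) {
            if (isISOControlNaive(cp) != isISOControlOptimized(cp)) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches in [-0x1000, 0x1000]: "
            + countMismatches(-0x1000, 0x1000));
        System.out.println("Integer.MIN_VALUE: "
            + isISOControlOptimized(Integer.MIN_VALUE));
    }
}
```

Negative inputs pass the first compare but fail both inner tests, because the
unsigned shift of a negative int is never zero.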

> isHighSurrogate(), isLowSurrogate(), in particular if latter occur consecutive.
>  8-bit shift + compare would allow HotSpot to compile to smart 1-byte
> immediate op-codes.

Alright, you've talked me into it,
I can't resist your love of micro-optimizations.

More later.

Martin


From Kelly.Ohair at Sun.COM  Fri Mar 12 04:59:09 2010
From: Kelly.Ohair at Sun.COM (Kelly O'Hair)
Date: Thu, 11 Mar 2010 20:59:09 -0800
Subject: TEST: java/nio/channels/AsynchronousSocketChannel/Basic.java
Message-ID: <D8973F65-6FDD-42D3-BB58-A9F6DB89D280@sun.com>


I'm having problems with this test on Solaris 10 X86 and Fedora 9 32bit.
Ring any bells?
-kto

--------------------------------------------------
TEST: java/nio/channels/AsynchronousSocketChannel/Basic.java
JDK under test: (/tmp/jprt/P1/T/173102.ohair/testproduct/solaris_i586_5.10-product)
openjdk version "1.7.0-2010-03-11-173102.ohair.jdk"
OpenJDK Runtime Environment (build 1.7.0-2010-03-11-173102.ohair.jdk-jprtadm_2010_03_11_09_40-b00)
Java HotSpot(TM) Server VM (build 17.0-b10, mixed mode)

ACTION: build -- Passed. Build successful
REASON: Named class compiled on demand
TIME:   0.89 seconds
messages:
command: build Basic
reason: Named class compiled on demand
elapsed time (seconds): 0.89

ACTION: compile -- Passed. Compilation successful
REASON: .class file out of date or does not exist
TIME:   0.89 seconds
messages:
command: compile /tmp/jprt/P1/T/173102.ohair/source/test/java/nio/channels/AsynchronousSocketChannel/Basic.java
reason: .class file out of date or does not exist
elapsed time (seconds): 0.89
STDOUT:
STDERR:

ACTION: main -- Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Should not connect
REASON: User specified action: run main/timeout=600 Basic
TIME:   4.007 seconds
messages:
command: main Basic
reason: User specified action: run main/timeout=600 Basic
elapsed time (seconds): 4.007
STDOUT:
-- bind --
-- socket options --
-- connect --
-- connect to non-existent host --
-- asynchronous close when connecting --
STDERR:
java.lang.RuntimeException: Should not connect
	at Basic.testCloseWhenPending(Basic.java:238)
	at Basic.main(Basic.java:46)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:613)
	at com.sun.javatest.regtest.MainWrapper$MainThread.run(MainWrapper.java:94)
	at java.lang.Thread.run(Thread.java:717)

JavaTest Message: Test threw exception: java.lang.RuntimeException: Should not connect
JavaTest Message: shutting down test

STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Should not connect

TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Should not connect
--------------------------------------------------

--------------------------------------------------
TEST: java/nio/channels/AsynchronousSocketChannel/Basic.java
JDK under test: (/tmp/jprt/P1/T/173102.ohair/testproduct/linux_i586_2.6-product)
openjdk version "1.7.0-2010-03-11-173102.ohair.jdk"
OpenJDK Runtime Environment (build 1.7.0-2010-03-11-173102.ohair.jdk-jprtadm_2010_03_11_09_41-b00)
Java HotSpot(TM) Server VM (build 17.0-b10, mixed mode)

ACTION: build -- Passed. Build successful
REASON: Named class compiled on demand
TIME:   0.85 seconds
messages:
command: build Basic
reason: Named class compiled on demand
elapsed time (seconds): 0.85

ACTION: compile -- Passed. Compilation successful
REASON: .class file out of date or does not exist
TIME:   0.85 seconds
messages:
command: compile /tmp/jprt/P1/T/173102.ohair/source/test/java/nio/channels/AsynchronousSocketChannel/Basic.java
reason: .class file out of date or does not exist
elapsed time (seconds): 0.85
STDOUT:
STDERR:

ACTION: main -- Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Connection should not be established
REASON: User specified action: run main/timeout=600 Basic
TIME:   10.574 seconds
messages:
command: main Basic
reason: User specified action: run main/timeout=600 Basic
elapsed time (seconds): 10.574
STDOUT:
-- bind --
-- socket options --
-- connect --
-- connect to non-existent host --
STDERR:
java.lang.RuntimeException: Connection should not be established
	at Basic.testConnect(Basic.java:206)
	at Basic.main(Basic.java:45)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:613)
	at com.sun.javatest.regtest.MainWrapper$MainThread.run(MainWrapper.java:94)
	at java.lang.Thread.run(Thread.java:717)

JavaTest Message: Test threw exception: java.lang.RuntimeException: Connection should not be established
JavaTest Message: shutting down test

STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Connection should not be established

TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Connection should not be established
--------------------------------------------------


From Alan.Bateman at Sun.COM  Fri Mar 12 08:06:55 2010
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Fri, 12 Mar 2010 08:06:55 +0000
Subject: TEST: java/nio/channels/AsynchronousSocketChannel/Basic.java
In-Reply-To: <D8973F65-6FDD-42D3-BB58-A9F6DB89D280@sun.com>
References: <D8973F65-6FDD-42D3-BB58-A9F6DB89D280@sun.com>
Message-ID: <4B99F61F.6060705@sun.com>

Kelly O'Hair wrote:
>
> I'm having problems with this test on Solaris 10 X86 and Fedora 9 32bit.
> Ring any bells?
> -kto
I haven't seen this failure but looking at the test now, the connect can 
fail immediately which would cause both of the test failures - can you 
create a bug and I'll fix this.

-Alan.


From Ulf.Zibis at gmx.de  Fri Mar 12 13:04:39 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 12 Mar 2010 14:04:39 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003111746o5dcd81d1veadb0c6a4882df65@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995D22.2020507@gmx.de>
	<1ccfd1c11003111746o5dcd81d1veadb0c6a4882df65@mail.gmail.com>
Message-ID: <4B9A3BE7.4090502@gmx.de>

Am 12.03.2010 02:46, schrieb Martin Buchholz:
> http://mercurial.selenic.com/wiki/MqExtension
> http://hgbook.red-bean.com/read/managing-change-with-mercurial-queues.html
>    

Ah, I see mq is part of hg. Another thing to learn, but it sounds good.
Unfortunately there seems to be no GUI for it. I run TortoiseHg on Windows.

>
> I propose the following small improvement on your own
> version of isISOControl:
>
>      public static boolean isISOControl(int codePoint) {
>          // Optimized form of:
>          //     (codePoint >= 0x0000 && codePoint <= 0x001F) ||
>          //     (codePoint >= 0x007F && codePoint <= 0x009F);
>          return codePoint <= 0x009F &&
>              (codePoint >= 0x007F || (codePoint >>> 5 == 0));
>      }
>
> Because non-ASCII chars get away with only one comparison.
>    

+1 thanks.
Because here we are talking about ASCII values, I would prefer 2-digit 
values or complete code points e.g. 0x0000007F.

>
> Alright, you've talked me into it,
> I can't resist your love of micro-optimizations.
>    

Is that sentence grammatically correct English? I'm afraid I might misunderstand it.

By the way, I've filed some bugs against HotSpot to optimize those cases:
6932837 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6932837> - 
Better use unsigned jump if one of the range limits is 0
6932855 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6932855> - 
Save superfluous CMP instruction from while loop
6933327 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6933327> - 
Use shifted addressing modes instead of shift instructions
6933324 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6933324> - 
Always inline methods, which have only 1 call site

If they were accepted and fixed, some of our twiddling would become 
superfluous, at least when using c2, but maybe not for the interpreter and c1.

-Ulf


From Ulf.Zibis at gmx.de  Fri Mar 12 16:41:07 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 12 Mar 2010 17:41:07 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8DA070.3040306@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
Message-ID: <4B9A6EA3.7070603@gmx.de>

Hi Martin,

is that 6666666 a fake bug id?
I still can't see it in the public bugparade:

    * This bug is not available.

      More information is available at
      -http://developers.sun.com/resources/bugsFAQ.html#s4q2

-Ulf


Am 11.03.2010 20:38, schrieb Martin Buchholz:
> Ulf, your changes would be easier to get in
> if they were organized as mq patch files that
> could be qimported into an existing mq repo.
>
> I've done that below, which includes a subset of
> your own proposed changes:
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint/
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings/
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/malformed-utf8/
>
> Sherman (or Alan),
>
> please review and/or file bugs for the above changes.
>
> isBMPCodePoint is a spec addition, requiring additional paperwork.
>
> Sherman, you owe me a response to my now-moldy proposed changes to
> the UTF-8 charset.
>
> The only controversial change would be the change in behavior in
> malformed-utf8, which I can take out.
>
> Martin
>
>    

From Kelly.Ohair at Sun.COM  Fri Mar 12 17:15:33 2010
From: Kelly.Ohair at Sun.COM (Kelly O'Hair)
Date: Fri, 12 Mar 2010 09:15:33 -0800
Subject: TEST: java/nio/channels/AsynchronousSocketChannel/Basic.java
In-Reply-To: <4B99F61F.6060705@sun.com>
References: <D8973F65-6FDD-42D3-BB58-A9F6DB89D280@sun.com>
	<4B99F61F.6060705@sun.com>
Message-ID: <599F4086-783E-4C05-9BEA-262758B32D35@Sun.COM>

Filed bug 6934585

-kto

On Mar 12, 2010, at 12:06 AM, Alan Bateman wrote:

> Kelly O'Hair wrote:
>>
>> I'm having problems with this test on Solaris 10 X86 and Fedora 9  
>> 32bit.
>> Ring any bells?
>> -kto
> I haven't seen this failure but looking at the test now, the connect  
> can fail immediately which would cause both of the test failures -  
> can you create a bug and I'll fix this.
>
> -Alan.


From jonathan.gibbons at sun.com  Fri Mar 12 20:01:35 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Fri, 12 Mar 2010 20:01:35 +0000
Subject: hg: jdk7/tl/langtools: 6934224: update langtools/test/Makefile
Message-ID: <20100312200137.D4E54448C3@hg.openjdk.java.net>

Changeset: f856c0942c06
Author:    jjg
Date:      2010-03-12 12:00 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/f856c0942c06

6934224: update langtools/test/Makefile
Reviewed-by: ohair

! make/jprt.properties
! test/Makefile


From kelly.ohair at sun.com  Fri Mar 12 20:15:43 2010
From: kelly.ohair at sun.com (kelly.ohair at sun.com)
Date: Fri, 12 Mar 2010 20:15:43 +0000
Subject: hg: jdk7/tl/jdk: 2 new changesets
Message-ID: <20100312201621.2443B448C8@hg.openjdk.java.net>

Changeset: bf6eb240e718
Author:    ohair
Date:      2010-03-12 09:03 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/bf6eb240e718

6933294: Fix some test/Makefile issues around Linux ARCH settings, better defaults
Reviewed-by: jjg

! test/Makefile
! test/ProblemList.txt

Changeset: cda90ceb7176
Author:    ohair
Date:      2010-03-12 09:06 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/cda90ceb7176

Merge

! test/ProblemList.txt


From martinrb at google.com  Fri Mar 12 23:04:06 2010
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 12 Mar 2010 15:04:06 -0800
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4B995D22.2020507@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995D22.2020507@gmx.de>
Message-ID: <1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>

On Thu, Mar 11, 2010 at 13:14, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 11.03.2010 20:38, schrieb Martin Buchholz:
> - Maybe better: "... using a single {@code char}".

Done.

> - Why don't you like using the new isBMPCodePoint() for
> isSupplementaryCodePoint() and toUpperCaseCharArray() ?

I now use it for the assert in toUpperCaseCharArray()

> - Same shift magic would enhance isISOControl(), isHighSurrogate(),
> isLowSurrogate(), in particular if the latter are used consecutively.

isISOControl - yes, others - I am not convinced.

> 8-bit shift + compare would allow HotSpot to compile to smart 1-byte
> immediate op-codes.
> - Don't you think my notes on validity are worth adding? (Or as a separate bug?)

I agree something could be done here - separate bug.

> - Changing ch <= MAX_SURROGATE to ch < MAX_SURROGATE + 1 would allow the HotSpot
> compiler to optimize away one branch if those methods are used consecutively.

Done.
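
A minimal sketch of what this bound change looks like, for illustration only.
The class name SurrogateRanges is hypothetical and this is not the code from
the webrev. Whether HotSpot actually merges the two comparisons is the claim
under discussion, not something the sketch demonstrates; it only shows that
the high-surrogate upper bound and the low-surrogate lower bound become the
same constant.

```java
// Sketch only: SurrogateRanges is a hypothetical stand-in, not the JDK source.
final class SurrogateRanges {
    static boolean isHighSurrogate(char ch) {
        // Upper bound written as "< MAX + 1" instead of "<= MAX".
        return ch >= Character.MIN_HIGH_SURROGATE
            && ch < (Character.MAX_HIGH_SURROGATE + 1);
    }

    static boolean isLowSurrogate(char ch) {
        // MIN_LOW_SURROGATE equals MAX_HIGH_SURROGATE + 1 (0xDC00), so this
        // lower bound is the same constant as the upper bound used above.
        return ch >= Character.MIN_LOW_SURROGATE
            && ch < (Character.MAX_LOW_SURROGATE + 1);
    }
}
```

Both methods should return the same results as the "<=" forms for every char
value; only the constant that appears in the comparison changes.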

> - And lastly, I would like to make the constants complete (= adding
> MAX_SUPPLEMENTARY_CODE_POINT).

I have no objection to adding those, but I am not in favor either.
You'll need to convince someone else.

>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings/
>>
>
> That reminds me that some months ago I prepared a beautified version of
> Character's source (things like the above, replacing <code> with {@code},
> fixing indentation inconsistencies etc.). Would there be interest in such a
> patch?

Please provide URL of patch.

>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/malformed-utf8/

> In encodeBufferLoop() you could use putChar(), putInt() instead of put().
> That should perform better.

I'm not convinced.  You would need to assemble bytes into an
int, and then break them apart into bytes on the other side?

Martin


From martinrb at google.com  Fri Mar 12 23:13:25 2010
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 12 Mar 2010 15:13:25 -0800
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4B9A3BE7.4090502@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B8EB46C.1010208@sun.com>
	<4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995D22.2020507@gmx.de>
	<1ccfd1c11003111746o5dcd81d1veadb0c6a4882df65@mail.gmail.com>
	<4B9A3BE7.4090502@gmx.de>
Message-ID: <1ccfd1c11003121513g5a5c4a43xc44a79da36975ab7@mail.gmail.com>

On Fri, Mar 12, 2010 at 05:04, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 12.03.2010 02:46, schrieb Martin Buchholz:
>     public static boolean isISOControl(int codePoint) {
>         // Optimized form of:
>         //     (codePoint >= 0x0000 && codePoint <= 0x001F) ||
>         //     (codePoint >= 0x007F && codePoint <= 0x009F);
>         return codePoint <= 0x009F &&
>             (codePoint >= 0x007F || (codePoint >>> 5 == 0));
>     }
>
> Because non-ASCII chars get away with only one comparison.
>
>
> +1 thanks.
> Because here we are talking about ASCII values, I would prefer 2-digit
> values or complete code points e.g. 0x0000007F.

Good idea.  Done.
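
For readers who want to check the rewritten condition, here is a small sketch
that puts the plain two-range test next to the optimized form quoted above.
The class and method names (IsoControlCheck, plain, optimized, agree) are
made up for this example; only the two boolean expressions come from the
thread.

```java
// Sketch only: compares the two forms of the ISO control test from the thread.
final class IsoControlCheck {
    // Straightforward form: two closed ranges.
    static boolean plain(int codePoint) {
        return (codePoint >= 0x00 && codePoint <= 0x1F)
            || (codePoint >= 0x7F && codePoint <= 0x9F);
    }

    // Optimized form: anything above 0x9F is rejected by the first comparison.
    // For the rest, "codePoint >>> 5 == 0" is true exactly for 0x00..0x1F,
    // and false for negative inputs because >>> is an unsigned shift.
    static boolean optimized(int codePoint) {
        return codePoint <= 0x9F
            && (codePoint >= 0x7F || (codePoint >>> 5 == 0));
    }

    // True if both forms give the same answer for the given value.
    static boolean agree(int codePoint) {
        return plain(codePoint) == optimized(codePoint);
    }
}
```

Calling agree() over a range such as -1000..1000, plus Integer.MIN_VALUE and
Integer.MAX_VALUE, is a cheap way to convince yourself the two forms match.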

>
> Alright, you've talked me into it,
> I can't resist your love of micro-optimizations.
>
>
> Is that sentence correct English grammar? I'm afraid I might misunderstand it.

, ==> .

> By the way, I've filed some bugs against HotSpot to optimize those cases:
> 6932837 - Better use unsigned jump if one of the range limits is 0
> 6932855 - Save superfluous CMP instruction from while loop
> 6933327 - Use shifted addressing modes instead of shift instructions
> 6933324 - Always inline methods, which have only 1 call site
>
> If they were accepted and fixed, some of our twiddling would become
> superfluous, at least when using c2, but maybe not for the interpreter and c1.

Of course, we often write hotspot-optimized code, but in general we should
try to write "good" code that any VM could love.

Martin


From jonathan.gibbons at sun.com  Fri Mar 12 23:25:00 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Fri, 12 Mar 2010 23:25:00 +0000
Subject: hg: jdk7/tl: 6934712: run langtools jtreg tests from top level
	test/Makefile
Message-ID: <20100312232500.DB883448F5@hg.openjdk.java.net>

Changeset: bbd817429100
Author:    jjg
Date:      2010-03-12 15:22 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/rev/bbd817429100

6934712: run langtools jtreg tests from top level test/Makefile
Reviewed-by: ohair

! test/Makefile


From martinrb at google.com  Fri Mar 12 23:29:53 2010
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 12 Mar 2010 15:29:53 -0800
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4B995F9C.3070705@sun.com>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995F9C.3070705@sun.com>
Message-ID: <1ccfd1c11003121529r22651bfcnfca6435311d707a6@mail.gmail.com>

OK, next round of review.

I changed my UTF-8 changes to be behavior-preserving,
removing any hint of controversy, and renamed the patch
to "utf8-twiddling".

I got Ulf in my head, and can't stop micro-optimizing.
I added a new micro-optimizing patch for Bits.java.
Please file a bug.

6934268: Better implementation of Character.isValidCodePoint and
isSupplementaryCodePoint()
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint
6934265: Add public method Character.isBMPCodePoint
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
6934270: Remove javac warnings from Character.java
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings
6934271: Better handling of longer utf-8 sequences
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/utf8-twiddling
6666666: Optimize bit-twiddling in Bits.java
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/qtip tip Bits.java

Now I need to go off to my micro-optimizers-anonymous meeting.

Martin

On Thu, Mar 11, 2010 at 13:24, Xueming Shen <Xueming.Shen at sun.com> wrote:
> Martin, Ulf
>
> The following bugs/RFEs have been filed.
>
> 6934265 Add public method Character.isBMPCodePoint
> 6934268 Better implementation of Character.isValidCodePoint and
> isSupplementaryCodePoint()
> 6934270: Remove javac warnings from Character.java
> 6934271: Better handling of longer utf-8 sequences
>
> Masayoshi, Alan would you please help review the corresponding CCC for
> 6934265 at
> http://ccc.sfbay.sun.com/6934265
>
> Martin, don't touch the utf-8 malformed issue for now; an incompatible
> change in UTF-8 is an issue.
>
> sherman
>
> Martin Buchholz wrote:
>>
>> Ulf, your changes would be easier to get in
>> if they were organized as mq patch files that
>> could be qimported into an existing mq repo.
>>
>> I've done that below, which includes a subset of
>> your own proposed changes:
>>
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint/
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings/
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/malformed-utf8/
>>
>> Sherman (or Alan),
>>
>> please review and/or file bugs for the above changes.
>>
>> isBMPCodePoint is a spec addition, requiring additional paperwork.
>>
>> Sherman, you owe me a response to my now-moldy proposed changes to
>> the UTF-8 charset.
>>
>> The only controversial change would be the change in behavior in
>> malformed-utf8, which I can take out.
>>
>> Martin
>>
>> On Thu, Mar 11, 2010 at 10:32, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>
>>>
>>> Sherman,
>>>
>>> I know, your time ...
>>>
>>> ... but maybe someone is needed for sponsor here:
>>> https://bugs.openjdk.java.net/show_bug.cgi?id=100132
>>>
>>> Could you do this?
>>>
>>> Much thanks,
>>>
>>> -Ulf
>>>
>>>
>>> Am 10.03.2010 19:23, schrieb Xueming Shen:
>>>
>>>>
>>>> approved.
>>>>
>>>> I don't have a spare ws right now.so please just push, it's almost
>>>> there:-)
>>>>
>>>> sherman
>>>>
>>>> Martin Buchholz wrote:
>>>>
>>>>>
>>>>> Here's the proposed fix for
>>>>> 6931812: A better implementation of sun.nio.cs.Surrogate.isBMP(int)
>>>>>
>>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint/
>>>>>
>>>>> I changed the name to isBMPCodePoint in preparation for moving
>>>>> it to Character.java.
>>>>> (Sherman, perhaps you would like to take on that followon task?)
>>>>>
>>>>> Sherman, please approve.
>>>>>
>>>>> Martin
>>>>>
>>>>> On Sat, Mar 6, 2010 at 13:00, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>>>
>>>>>>
>>>>>> Very fast Sherman, much thanks.
>>>>>>
>>>>>> Could you set the bug to accepted and evaluated, so my patch will have
>>>>>> a
>>>>>> chance to get into the code base?
>>>>>>
>>>>>> -Ulf
>>>>>>
>>>>>>
>>>>>> Am 03.03.2010 20:11, schrieb Xueming Shen:
>>>>>>
>>>>>>>
>>>>>>> #6931812
>>>>>>>
>>>>>>> Martin Buchholz wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Sherman, would you like to file bugs for Ulf's improvements?
>>>>>>>>
>>>>>>>> On Wed, Mar 3, 2010 at 02:44, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Am 03.03.2010 09:00, schrieb Martin Buchholz:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Keep in mind that supplementary characters are extremely rare.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, but many APIs in the JDK are used rarely.
>>>>>>>>> Why should they waste memory footprint or perform badly, particularly
>>>>>>>>> if it doesn't cost anything?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I admire your perfectionism.
>>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Therefore the existing implementation
>>>>>>>>>>
>>>>>>>>>> return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>>>>>>     && codePoint <= MAX_CODE_POINT;
>>>>>>>>>>
>>>>>>>>>> will almost always perform just one comparison against a constant,
>>>>>>>>>> which is hard to beat.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>>>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>>>>>>> against
>>>>>>>>> only 8 bit values in my case.
>>>>>>>>>
>>>>>>>>
>>>>>>>> It's a good point.  In the machine code, shifts are likely to use
>>>>>>>> immediate values, and so will be a small win.
>>>>>>>>
>>>>>>>> int x = codePoint >>> 16;
>>>>>>>> return x != 0 && x < 0x11;
>>>>>>>>
>>>>>>>> (On modern hardware, these optimizations
>>>>>>>> are less valuable than they used to be;
>>>>>>>> ordinary integer arithmetic is almost free)
>>>>>>>>
>>>>>>>> Martin
>>>>>>>>
>>>>
>>>>
>>>
>>>
>
>


From kelly.ohair at sun.com  Sat Mar 13 01:47:22 2010
From: kelly.ohair at sun.com (kelly.ohair at sun.com)
Date: Sat, 13 Mar 2010 01:47:22 +0000
Subject: hg: jdk7/tl: 6934759: Add langtools testing to jprt control builds
Message-ID: <20100313014722.B3FF544918@hg.openjdk.java.net>

Changeset: c60ed0f6d91a
Author:    ohair
Date:      2010-03-12 17:44 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/rev/c60ed0f6d91a

6934759: Add langtools testing to jprt control builds
Reviewed-by: jjg

! make/jprt.properties


From Xueming.Shen at Sun.COM  Tue Mar 16 06:26:42 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Mon, 15 Mar 2010 22:26:42 -0800
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003121529r22651bfcnfca6435311d707a6@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995F9C.3070705@sun.com>
	<1ccfd1c11003121529r22651bfcnfca6435311d707a6@mail.gmail.com>
Message-ID: <4B9F24A2.2070300@sun.com>

CR 6935172 Created, P4 java/classes_io Optimize bit-twiddling in Bits.java

Can I assume the webrev is

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Bits.java

-Sherman

Martin Buchholz wrote:
> OK, next round of review.
>
> I changed my UTF-8 changes to be behavior-preserving,
> removing any hint of controversy, and renamed the patch
> to "utf8-twiddling".
>
> I got Ulf in my head, and can't stop micro-optimizing.
> I added a new micro-optimizing patch for Bits.java.
> Please file a bug.
>
> 6934268: Better implementation of Character.isValidCodePoint and
> isSupplementaryCodePoint()
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint
> 6934265: Add public method Character.isBMPCodePoint
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
> 6934270: Remove javac warnings from Character.java
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings
> 6934271: Better handling of longer utf-8 sequences
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/utf8-twiddling
> 6666666: Optimize bit-twiddling in Bits.java
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/qtip tip Bits.java
>
> Now I need to go off to my micro-optimizers-anonymous meeting.
>
> Martin
>
> On Thu, Mar 11, 2010 at 13:24, Xueming Shen <Xueming.Shen at sun.com> wrote:
>   
>> Martin, Ulf
>>
>> The following bugs/RFEs have been filed.
>>
>> 6934265 Add public method Character.isBMPCodePoint
>> 6934268 Better implementation of Character.isValidCodePoint and
>> isSupplementaryCodePoint()
>> 6934270: Remove javac warnings from Character.java
>> 6934271: Better handling of longer utf-8 sequences
>>
>> Masayoshi, Alan would you please help review the corresponding CCC for
>> 6934265 at
>> http://ccc.sfbay.sun.com/6934265
>>
>> Martin, don't touch the utf-8 malformed issue for now; an incompatible
>> change in UTF-8 is an issue.
>>
>> sherman
>>
>> Martin Buchholz wrote:
>>     
>>> Ulf, your changes would be easier to get in
>>> if they were organized as mq patch files that
>>> could be qimported into an existing mq repo.
>>>
>>> I've done that below, which includes a subset of
>>> your own proposed changes:
>>>
>>>
>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint/
>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint/
>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings/
>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/malformed-utf8/
>>>
>>> Sherman (or Alan),
>>>
>>> please review and/or file bugs for the above changes.
>>>
>>> isBMPCodePoint is a spec addition, requiring additional paperwork.
>>>
>>> Sherman, you owe me a response to my now-moldy proposed changes to
>>> the UTF-8 charset.
>>>
>>> The only controversial change would be the change in behavior in
>>> malformed-utf8, which I can take out.
>>>
>>> Martin
>>>
>>> On Thu, Mar 11, 2010 at 10:32, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>
>>>       
>>>> Sherman,
>>>>
>>>> I know, your time ...
>>>>
>>>> ... but maybe someone is needed for sponsor here:
>>>> https://bugs.openjdk.java.net/show_bug.cgi?id=100132
>>>>
>>>> Could you do this?
>>>>
>>>> Much thanks,
>>>>
>>>> -Ulf
>>>>
>>>>
>>>> Am 10.03.2010 19:23, schrieb Xueming Shen:
>>>>
>>>>         
>>>>> approved.
>>>>>
>>>>> I don't have a spare ws right now.so please just push, it's almost
>>>>> there:-)
>>>>>
>>>>> sherman
>>>>>
>>>>> Martin Buchholz wrote:
>>>>>
>>>>>           
>>>>>> Here's the proposed fix for
>>>>>> 6931812: A better implementation of sun.nio.cs.Surrogate.isBMP(int)
>>>>>>
>>>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint/
>>>>>>
>>>>>> I changed the name to isBMPCodePoint in preparation for moving
>>>>>> it to Character.java.
>>>>>> (Sherman, perhaps you would like to take on that followon task?)
>>>>>>
>>>>>> Sherman, please approve.
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>> On Sat, Mar 6, 2010 at 13:00, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>>>>
>>>>>>             
>>>>>>> Very fast Sherman, much thanks.
>>>>>>>
>>>>>>> Could you set the bug to accepted and evaluated, so my patch will have
>>>>>>> a
>>>>>>> chance to get into the code base?
>>>>>>>
>>>>>>> -Ulf
>>>>>>>
>>>>>>>
>>>>>>> Am 03.03.2010 20:11, schrieb Xueming Shen:
>>>>>>>
>>>>>>>               
>>>>>>>> #6931812
>>>>>>>>
>>>>>>>> Martin Buchholz wrote:
>>>>>>>>
>>>>>>>>                 
>>>>>>>>> Sherman, would you like to file bugs for Ulf's improvements?
>>>>>>>>>
>>>>>>>>> On Wed, Mar 3, 2010 at 02:44, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> Am 03.03.2010 09:00, schrieb Martin Buchholz:
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>>>> Keep in mind that supplementary characters are extremely rare.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>                       
>>>>>>>>>> Yes, but many API's in the JDK are used rarely.
>>>>>>>>>> Why should they waste memory footprint / perform bad, particularly
>>>>>>>>>> if
>>>>>>>>>> it
>>>>>>>>>> doesn't cost anything.
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>> I admire your perfectionism.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>>> Therefore the existing implementation
>>>>>>>>>>>
>>>>>>>>>>>  return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>>>>>>>      && codePoint <= MAX_CODE_POINT;
>>>>>>>>>>>
>>>>>>>>>>> will almost always perform just one comparison against a constant,
>>>>>>>>>>> which is hard to beat.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>                       
>>>>>>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>>>>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>>>>>>>> against
>>>>>>>>>> only 8 bit values in my case.
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>> It's a good point.  In the machine code, shifts are likely to use
>>>>>>>>> immediate values, and so will be a small win.
>>>>>>>>>
>>>>>>>>> int x = codePoint >>> 16;
>>>>>>>>> return x != 0 && x < 0x11;
>>>>>>>>>
>>>>>>>>> (On modern hardware, these optimizations
>>>>>>>>> are less valuable than they used to be;
>>>>>>>>> ordinary integer arithmetic is almost free)
>>>>>>>>>
>>>>>>>>> Martin
>>>>>>>>>
>>>>>>>>>                   
>>>>>           
>>>>         
>>     


From christopher.hegarty at sun.com  Tue Mar 16 10:06:44 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Tue, 16 Mar 2010 10:06:44 +0000
Subject: hg: jdk7/tl/jdk: 6934923: test/java/net/ipv6tests/TcpTest.java hangs
	on Solaris 10
Message-ID: <20100316100704.71B7644D83@hg.openjdk.java.net>

Changeset: f88f6f8ddd21
Author:    chegar
Date:      2010-03-16 10:05 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/f88f6f8ddd21

6934923: test/java/net/ipv6tests/TcpTest.java hangs on Solaris 10
Reviewed-by: alanb

! test/java/net/ipv6tests/TcpTest.java
! test/java/net/ipv6tests/Tests.java


From christopher.hegarty at sun.com  Tue Mar 16 14:34:19 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Tue, 16 Mar 2010 14:34:19 +0000
Subject: hg: jdk7/tl/jdk: 6935199: java/net regression tests failing with
	Assertions
Message-ID: <20100316143439.1BD6244DC5@hg.openjdk.java.net>

Changeset: 895a1211b2e1
Author:    chegar
Date:      2010-03-16 14:31 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/895a1211b2e1

6935199: java/net regression tests failing with Assertions
Reviewed-by: michaelm

! test/ProblemList.txt
! test/java/net/CookieHandler/TestHttpCookie.java
! test/java/net/URLClassLoader/closetest/CloseTest.java


From Xueming.Shen at Sun.COM  Tue Mar 16 20:06:53 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Tue, 16 Mar 2010 12:06:53 -0800
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
Message-ID: <4B9FE4DD.1090405@sun.com>

Martin Buchholz wrote:
> Therefore the existing implementation
>>>  return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>>             && codePoint <= MAX_CODE_POINT;
>>>
>>> will almost always perform just one comparison against a constant,
>>> which is hard to beat.
>>>
>>>       
>> 1. Wondering: I think there are TWO comparisons.
>> 2. Those comparisons need to load 32 bit values from machine code, against
>> only 8 bit values in my case.
>>     
>
> It's a good point.  In the machine code, shifts are likely to use
> immediate values, and so will be a small win.
>
> int x = codePoint >>> 16;
> return x != 0 && x < 0x11;
>
> (On modern hardware, these optimizations
> are less valuable than they used to be;
> ordinary integer arithmetic is almost free)
>
>   

I'm not convinced that the proposed code is really better, i.e. a "small win".

Without seeing the real native machine code generated, I'm not sure
if

       0: iload_0 
       1: bipush        16
       3: iushr         
       4: istore_1      
       5: iload_1       
       6: ifeq          19

is really better than

       0: iload_0 
       1: ldc           #2                  // int 65536
       3: if_icmplt     16


for the BMP character case, especially given that the existing code is more 
readable and, yes, shorter.

Yes, the shift might be able to use immediate values, but it still needs 
to handle the operands, and it is an extra operation. The only chance the 
new one might be better is if "ifeq" is faster than "if_icmplt", but I have 
not worked at the instruction set level for a long time, so I can't tell. 
(I vaguely remember from my old gcc compiler days that you have to check the 
cycle count of each operation to see which one is faster.)

OK, convince me:-)

-Sherman


public class Character extends java.lang.Object {
  public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;

  public static final int MAX_CODE_POINT = 1114111;

  public Character();
    Code:
       0: aload_0       
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return        

  public static boolean isSupplementaryCodePoint(int);
    Code:
       0: iload_0       
       1: ldc           #2                  // int 65536
       3: if_icmplt     16
       6: iload_0       
       7: ldc           #3                  // int 1114111
       9: if_icmpgt     16
      12: iconst_1      
      13: goto          17
      16: iconst_0      
      17: ireturn       

  public static boolean isSupplementaryCodePoint_new(int);
    Code:
       0: iload_0       
       1: bipush        16
       3: iushr         
       4: istore_1      
       5: iload_1       
       6: ifeq          19
       9: iload_1       
      10: bipush        17
      12: if_icmpge     19
      15: iconst_1      
      16: goto          20
      19: iconst_0      
      20: ireturn       
}
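
For reference, a sketch of Java source that should compile to bytecode of
roughly the shape listed above. This is a reconstruction, not the file that
was actually compiled: the method names and constant values are taken from
the listing, while the class name SupplementaryCheck is an assumption chosen
so that the example does not shadow java.lang.Character.

```java
// Reconstruction for illustration; the original test source was not posted.
final class SupplementaryCheck {
    static final int MIN_SUPPLEMENTARY_CODE_POINT = 0x010000; // 65536
    static final int MAX_CODE_POINT = 0x10FFFF;               // 1114111

    // Existing form: two comparisons against 32-bit constants.
    static boolean isSupplementaryCodePoint(int codePoint) {
        return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
            && codePoint <= MAX_CODE_POINT;
    }

    // Proposed form: shift out the low 16 bits, then compare the plane
    // number against small constants (planes 0x01..0x10 are supplementary).
    static boolean isSupplementaryCodePoint_new(int codePoint) {
        int x = codePoint >>> 16;
        return x != 0 && x < 0x11;
    }
}
```

Running javap -c on the compiled class lets you compare the two methods at
the bytecode level; what HotSpot makes of them after JIT compilation is a
separate question that the bytecode alone does not answer.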


From Ulf.Zibis at gmx.de  Tue Mar 16 19:48:07 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 16 Mar 2010 20:48:07 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995D22.2020507@gmx.de>
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
Message-ID: <4B9FE077.3060608@gmx.de>

Here my additions:

Am 13.03.2010 00:04, schrieb Martin Buchholz:
>> - Why don't you like using the new isBMPCodePoint() for
>> isSupplementaryCodePoint() and toUpperCaseCharArray() ?
>>      
> I now use it for the assert in toUpperCaseCharArray()
>    

     return !isBMPCodePoint(codePoint) && isValidCodePoint(codePoint);
yields the same result as the current code.
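
A small sketch of that composition, for illustration. The two helper bodies
are assumptions (shift-based forms in the spirit of this thread), and the
class name SupplementaryViaBmp is hypothetical; the webrevs may define the
helpers differently.

```java
// Sketch only: helper bodies are assumed, not copied from the webrevs.
final class SupplementaryViaBmp {
    // Assumed helper: a BMP code point has no bits set above bit 15.
    static boolean isBMPCodePoint(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // Assumed helper: valid code points lie in planes 0x00 through 0x10.
    // Negative inputs shift to a large positive plane number and fail.
    static boolean isValidCodePoint(int codePoint) {
        return (codePoint >>> 16) < 0x11;
    }

    // Supplementary means "valid, but not in the BMP".
    static boolean isSupplementaryCodePoint(int codePoint) {
        return !isBMPCodePoint(codePoint) && isValidCodePoint(codePoint);
    }
}
```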

>    
>> - Same shift magic would enhance isISOControl(), isHighSurrogate(),
>> isLowSurrogate(), in particular if latter occur consecutive.
>>      
> isISOControl - yes, others - I am not convinced.
>    

If virtually shifted by 8, HotSpot could use a cheaper 1-byte compare on 
the high byte.
Additionally, those methods are often used consecutively, so all 4 
compares would benefit from it.

>>   8-bit shift + compare would allow HotSpot to compile to smart 1-byte
>> immediate op-codes.
>> In encodeBufferLoop() you could use putChar(), putInt() instead put().
>> Should perform better.
>>      
> I'm not convinced.  You would need to assemble bytes into an
> int, and then break them apart into bytes on the other side?
>    

Some time ago, I disassembled such code. I could see that the int was 
copied directly to memory by one 32-bit move instruction.
When put(byte) was used, I saw four 8-bit move instructions.

I have not disassembled the case where a 3-byte value is first collected in a 
3-byte byte[] and then copied by put(byte[]). Maybe HotSpot could 
optimize here too.

Try it out. Two pairs of eyes see more than one. Maybe I was wrong.
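
To make the two alternatives concrete, here is a minimal sketch that writes
the same four bytes once with put(byte) and once with putInt(int). The class
and method names (PutIntVsPut, bytewise, wordwise) are invented for this
example. It only shows that the two forms produce identical buffer contents;
it says nothing about which one HotSpot compiles to faster machine code.

```java
import java.nio.ByteBuffer;

// Sketch only: shows that four put(byte) calls and one putInt are equivalent.
final class PutIntVsPut {
    // Writes the four bytes of v one at a time, most significant byte first.
    static byte[] bytewise(int v) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.put((byte) (v >>> 24));
        bb.put((byte) (v >>> 16));
        bb.put((byte) (v >>> 8));
        bb.put((byte) v);
        return bb.array();
    }

    // Writes the same four bytes with a single call. A newly allocated
    // ByteBuffer is big-endian, so the byte order matches bytewise().
    static byte[] wordwise(int v) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.putInt(v);
        return bb.array();
    }
}
```

Martin's objection still applies to an encoder: the bytes first have to be
assembled into an int before putInt can be used, so any gain depends on how
cheap that assembly is compared to the separate stores.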

BTW: for the same optimization, I would like to have putInt() and 
putLong() in CharBuffer and ShortBuffer, and the latter also in IntBuffer.


-Ulf


From Ulf.Zibis at gmx.de  Tue Mar 16 20:10:08 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 16 Mar 2010 21:10:08 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4B9FE4DD.1090405@sun.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
Message-ID: <4B9FE5A0.7010409@gmx.de>

Here you can see how HotSpot could benefit from that bit twiddling:

I've filed some bugs against HotSpot to optimize those cases:
6932837 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6932837> - 
Better use unsigned jump if one of the range limits is 0
6933327 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6933327> - 
Use shifted addressing modes instead of shift instructions

-Ulf


On 16.03.2010 21:06, Xueming Shen wrote:
> Martin Buchholz wrote:
>> Therefore the existing implementation
>>>>  return codePoint>= MIN_SUPPLEMENTARY_CODE_POINT
>>>> &&  codePoint<= MAX_CODE_POINT;
>>>>
>>>> will almost always perform just one comparison against a constant,
>>>> which is hard to beat.
>>>>
>>> 1. Wondering: I think there are TWO comparisons.
>>> 2. Those comparisons need to load 32 bit values from machine code, 
>>> against
>>> only 8 bit values in my case.
>>
>> It's a good point.  In the machine code, shifts are likely to use
>> immediate values, and so will be a small win.
>>
>> int x = codePoint >>> 16;
>> return x != 0 && x < 0x11;
>>
>> (On modern hardware, these optimizations
>> are less valuable than they used to be;
>> ordinary integer arithmetic is almost free)
>>
>
> I'm not convinced if the proposed code is really better...a "small win".
>
> Without seeing the real native machine code generated, I'm not sure
> if
>
>       0: iload_0
>       1: bipush        16
>       3: iushr
>       4: istore_1
>       5: iload_1
>       6: ifeq          19
>
> is really better than
>
>       0: iload_0
>       1: ldc           #2                  // int 65536
>       3: if_icmplt     16
>
>
> for bmp character case, especially given the existing code has better 
> readability and yes, shorter....
>
> Yes, shift might be able to use the immediate values, but it still 
> needs to handle the "operands"
> and it is an extra operation. The only chance the new one might be 
> better is that the "ifeq" is
> faster than "if_icmplt", but have not worked on the instruction set 
> level for too long, so I can't
> tell (kinda remember you have to check the "circles" of each operation 
> to see which one is
> "faster" during my old gcc compiler day)
>
> OK, convince me:-)
>
> -Sherman
>
>
> public class Character extends java.lang.Object {
>   public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;
>
>   public static final int MAX_CODE_POINT = 1114111;
>
>   public Character();
>     Code:
>        0: aload_0
>        1: invokespecial #1                  // Method java/lang/Object."<init>":()V
>        4: return
>
>   public static boolean isSupplementaryCodePoint(int);
>     Code:
>        0: iload_0
>        1: ldc           #2                  // int 65536
>        3: if_icmplt     16
>        6: iload_0
>        7: ldc           #3                  // int 1114111
>        9: if_icmpgt     16
>       12: iconst_1
>       13: goto          17
>       16: iconst_0
>       17: ireturn
>
>   public static boolean isSupplementaryCodePoint_new(int);
>     Code:
>        0: iload_0
>        1: bipush        16
>        3: iushr
>        4: istore_1
>        5: iload_1
>        6: ifeq          19
>        9: iload_1
>       10: bipush        17
>       12: if_icmpge     19
>       15: iconst_1
>       16: goto          20
>       19: iconst_0
>       20: ireturn
> }
>
>

From martinrb at google.com  Tue Mar 16 20:28:05 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 16 Mar 2010 13:28:05 -0700
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4B9FE077.3060608@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B8EB46C.1010208@sun.com>
	<4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995D22.2020507@gmx.de>
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
	<4B9FE077.3060608@gmx.de>
Message-ID: <1ccfd1c11003161328i5334041fre25eb31d9fa53e9e@mail.gmail.com>

On Tue, Mar 16, 2010 at 12:48, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Here my additions:
>
> Am 13.03.2010 00:04, schrieb Martin Buchholz:
>>>
>>> - Why don't you like using the new isBMPCodePoint() for
>>> isSupplementaryCodePoint() and toUpperCaseCharArray() ?
>>>
>>
>> I now use it for the assert in toUpperCaseCharArray()
>>
>
>     return !isBMPCodePoint() && isValidCodePoint();
> resolves in same than current code.

Hmmmm......

Yes, you've convinced me!
Done.

>>> - Same shift magic would enhance isISOControl(), isHighSurrogate(),
>>> isLowSurrogate(), in particular if latter occur consecutive.
>>>
>>
>> isISOControl - yes, others - I am not convinced.
>>
>
> If virtually shifted by 8, HotSpot could use cheaper 1-byte compare on the
> high byte.
> Additionally, those methods are often used consecutively, so all 4 compares
> would benefit from.

Sorry, I'm still not convinced for the surrogate testing methods.
Almost all chars are less than MIN_SURROGATE, so you have to beat
the already amazingly cheap
x >= MIN_SURROGATE.

Martin


From Xueming.Shen at Sun.COM  Tue Mar 16 21:30:47 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Tue, 16 Mar 2010 13:30:47 -0800
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4B9FE5A0.7010409@gmx.de>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com> <4B9FE5A0.7010409@gmx.de>
Message-ID: <4B9FF887.6080402@sun.com>

What did you mean by "HotSpot could benefit from..."?

Are you saying:

if    ( 6932837 gets fixed ) {
    existing isSupplementaryCodePoint() impl is better
} else if ( 6933327 gets fixed ) {
    the proposed is better
} else {
    existing isSupplementaryCodePoint() impl might still be better
}

So we will only see any benefit if they "don't fix 6932837, but fix 
6933327"?

-Sherman


Ulf Zibis wrote:
> Here you can see, how HotSpot could benefit from that bit twiddling:
>
> I've filed some bugs against HotSpot to optimize those cases:
> 6932837 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6932837> - 
> Better use unsigned jump if one of the range limits is 0
> 6933327 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6933327> - 
> Use shifted addressing modes instead of shift instuctions
>
> -Ulf
>
>
> Am 16.03.2010 21:06, schrieb Xueming Shen:
>> Martin Buchholz wrote:
>>> Therefore the existing implementation
>>>>>  return codePoint>= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>             &&  codePoint<= MAX_CODE_POINT;
>>>>>
>>>>> will almost always perform just one comparison against a constant,
>>>>> which is hard to beat.
>>>>>
>>>>>       
>>>> 1. Wondering: I think there are TWO comparisons.
>>>> 2. Those comparisons need to load 32 bit values from machine code, 
>>>> against
>>>> only 8 bit values in my case.
>>>>     
>>>
>>> It's a good point.  In the machine code, shifts are likely to use
>>> immediate values, and so will be a small win.
>>>
>>> int x = codePoint >>> 16;
>>> return x != 0 && x < 0x11;
>>>
>>> (On modern hardware, these optimizations
>>> are less valuable than they used to be;
>>> ordinary integer arithmetic is almost free)
>>>
>>>   
>>
>> I'm not convinced if the proposed code is really better...a "small win".
>>
>> Without seeing the real native machine code generated, I'm not sure
>> if
>>
>>       0: iload_0
>>       1: bipush        16
>>       3: iushr
>>       4: istore_1
>>       5: iload_1
>>       6: ifeq          19
>>
>> is really better than
>>
>>       0: iload_0
>>       1: ldc           #2                  // int 65536
>>       3: if_icmplt     16
>>
>>
>> for bmp character case, especially given the existing code has better 
>> readability and yes, shorter....
>>
>> Yes, shift might be able to use the immediate values, but it still 
>> needs to handle the "operands"
>> and it is an extra operation. The only chance the new one might be 
>> better is that the "ifeq" is
>> faster than "if_icmplt", but have not worked on the instruction set 
>> level for too long, so I can't
>> tell (kinda remember you have to check the "circles" of each 
>> operation to see which one is
>> "faster" during my old gcc compiler day)
>>
>> OK, convince me:-)
>>
>> -Sherman
>>
>>
>> public class Character extends java.lang.Object {
>>   public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;
>>
>>   public static final int MAX_CODE_POINT = 1114111;
>>
>>   public Character();
>>     Code:
>>        0: aload_0
>>        1: invokespecial #1                  // Method java/lang/Object."<init>":()V
>>        4: return
>>
>>   public static boolean isSupplementaryCodePoint(int);
>>     Code:
>>        0: iload_0
>>        1: ldc           #2                  // int 65536
>>        3: if_icmplt     16
>>        6: iload_0
>>        7: ldc           #3                  // int 1114111
>>        9: if_icmpgt     16
>>       12: iconst_1
>>       13: goto          17
>>       16: iconst_0
>>       17: ireturn
>>
>>   public static boolean isSupplementaryCodePoint_new(int);
>>     Code:
>>        0: iload_0
>>        1: bipush        16
>>        3: iushr
>>        4: istore_1
>>        5: iload_1
>>        6: ifeq          19
>>        9: iload_1
>>       10: bipush        17
>>       12: if_icmpge     19
>>       15: iconst_1
>>       16: goto          20
>>       19: iconst_0
>>       20: ireturn
>> }
>>
>>


From Ulf.Zibis at gmx.de  Tue Mar 16 20:36:16 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 16 Mar 2010 21:36:16 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003161328i5334041fre25eb31d9fa53e9e@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8EB46C.1010208@sun.com>	
	<4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995D22.2020507@gmx.de>	
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>	
	<4B9FE077.3060608@gmx.de>
	<1ccfd1c11003161328i5334041fre25eb31d9fa53e9e@mail.gmail.com>
Message-ID: <4B9FEBC0.8070200@gmx.de>

On 16.03.2010 21:28, Martin Buchholz wrote:
> On Tue, Mar 16, 2010 at 12:48, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>
> Hmmmm......
>
> Yes, you've convinced me!
> Done.
>    

THE meeting had its success. ;-)


>    
>>>> - Same shift magic would enhance isISOControl(), isHighSurrogate(),
>>>> isLowSurrogate(), in particular if latter occur consecutive.
>>>>
>>>>          
>>> isISOControl - yes, others - I am not convinced.
>>>
>>>        
>> If virtually shifted by 8, HotSpot could use cheaper 1-byte compare on the
>> high byte.
>> Additionally, those methods are often used consecutively, so all 4 compares
>> would benefit from.
>>      
> Sorry, I'm still not convinced for the surrogate testing methods.
> Almost all chars are less than MIN_SURROGATE, so you have to beat
> the already amazingly cheap
> x>= MIN_SURROGATE.
>    

Good point, but ...
... what about:
6933327 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6933327> - 
Use shifted addressing modes instead of shift instructions
and the internal review ID 1735166?

-Ulf


From martinrb at google.com  Tue Mar 16 20:46:26 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 16 Mar 2010 13:46:26 -0700
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4B9FE999.3050106@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B8EB46C.1010208@sun.com>
	<4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995F9C.3070705@sun.com>
	<1ccfd1c11003121529r22651bfcnfca6435311d707a6@mail.gmail.com>
	<4B9FE999.3050106@gmx.de>
Message-ID: <1ccfd1c11003161346o3496bc39gfa40583abd6bb8c9@mail.gmail.com>

On Tue, Mar 16, 2010 at 13:27, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 13.03.2010 00:29, schrieb Martin Buchholz:
>
> Won't you like to add:
>     * <p><b>Note:</b> In combination with {@link #isBMPCodePoint(int)} this
>     * method should be in 2nd place to permit additional HotSpot compiler
>     * optimization. Example:
>     * <blockquote><pre>
>     *     if (Character.isBMPCodePoint(codePoint))
>     *         ...;
>     *     else if (Character.isSupplementaryCodePoint(codePoint))
>     *         ...;
>     *     else
>     *         ...;
>     * </pre></blockquote>
>     *

No.

This kind of implementation-specific comment is not
traditionally put in public javadoc (it's considered OK
in private comments).  Also, we should not inflict our
dangerous micro-optimization disease on others.

>> 6934265: Add public method Character.isBMPCodePoint
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
>>
>
> Additionally please move static final int SIZE = 16 to one of the first
> lines of the code.
> See: https://bugs.openjdk.java.net/attachment.cgi?id=178&action=diff

No.

Although I agree with you that SIZE would be better near the top of the
class, I am not going to move it, at least not now.
For consistency, the SIZE fields in related classes like Short
should be moved as well.

>> 6934270: Remove javac warnings from Character.java
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings
>> 6934271: Better handling of longer utf-8 sequences
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/utf8-twiddling
>> 6666666: Optimize bit-twiddling in Bits.java
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Bits.java
>>
>
> Hm, I can't see any difference that would merit to see it as
> micro-optimization. Am I blind?

Bytecode is smaller.

>> Now I need to go off to my micro-optimizers-anonymous meeting.
>>
>
> Oh, you are coming to Cologne, Germany. Nice to meet you personally.

The last time I was in Cologne, it was for a job interview.
Unfortunately, there were only positions in the insurance industry at that time.

Martin


From martinrb at google.com  Tue Mar 16 20:50:55 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 16 Mar 2010 13:50:55 -0700
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4B9FEBC0.8070200@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995D22.2020507@gmx.de>
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
	<4B9FE077.3060608@gmx.de>
	<1ccfd1c11003161328i5334041fre25eb31d9fa53e9e@mail.gmail.com>
	<4B9FEBC0.8070200@gmx.de>
Message-ID: <1ccfd1c11003161350n25d12225kfb7621dee1a1d415@mail.gmail.com>

On Tue, Mar 16, 2010 at 13:36, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 16.03.2010 21:28, schrieb Martin Buchholz:

> Sorry, I'm still not convinced for the surrogate testing methods.
> Almost all chars are less than MIN_SURROGATE, so you have to beat
> the already amazingly cheap
> x >= MIN_SURROGATE.
>
>
> Good point, but ...
> ... what about :
> 6933327 - Use shifted addressing modes instead of shift instuctions
> and internal review ID of 1735166

Although I do worry about what hotspot will do with my bytecode
(as you know),
I mostly try to think more abstractly about the JIT and
simply produce high-quality JIT-friendly bytecode.

Your considerations in 6933327 seem valuable,
but are targeted at only one runtime compiler,
on only one machine architecture.

Martin


From Ulf.Zibis at gmx.de  Tue Mar 16 20:58:33 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 16 Mar 2010 21:58:33 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4B9FF887.6080402@sun.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com> <4B9FE5A0.7010409@gmx.de>
	<4B9FF887.6080402@sun.com>
Message-ID: <4B9FF0F9.4090606@gmx.de>

Very descriptive visualization.  :-)

I mean:
if    ( 6932837 && (review ID of 1735166) gets fixed ) {
    existing isSupplementaryCodePoint() impl is BEST
} else if ( 6933327 gets fixed ) {
    the proposed is better
} else {
    the proposed is still a little better than the existing isSupplementaryCodePoint() impl
}

Additionally, toUpperCaseCharArray(), codePointCountImpl(), and 
String(int[], int, int) would profit from consecutive use of 
isBMPCodePoint() + isSupplementaryCodePoint() or isHighSurrogate() + 
isLowSurrogate().

> So we will only see any benefit if they "don't fix 6932837, but fix 6933327"?

Fixing 6932837 wouldn't do any harm, but other code that shifts by 8 or 16 
would benefit from 6933327 as well.

-Ulf


On 16.03.2010 22:30, Xueming Shen wrote:
> What did you mean "Hotspot could benefit from..."
>
> Are you saying?
>
> if    ( 6932837 gets fixed ) {
>    existing isSupplementaryCodePoint() impl is better
> } else if ( 6933327 gets fixed ) {
>    the proposed is better
> } else {
>    existing isSupplementaryCodePoint() impl might still be better
> }
>
> So we will only see any benefit if they "don't fix 6932837, but fix 
> 6933327"?
>
> -Sherman
>
>
> Ulf Zibis wrote:
>> Here you can see, how HotSpot could benefit from that bit twiddling:
>>
>> I've filed some bugs against HotSpot to optimize those cases:
>> 6932837 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6932837> 
>> - Better use unsigned jump if one of the range limits is 0
>> 6933327 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6933327> 
>> - Use shifted addressing modes instead of shift instuctions
>>
>> -Ulf
>>
>>
>> Am 16.03.2010 21:06, schrieb Xueming Shen:
>>> Martin Buchholz wrote:
>>>> Therefore the existing implementation
>>>>>>  return codePoint>= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>> &&  codePoint<= MAX_CODE_POINT;
>>>>>>
>>>>>> will almost always perform just one comparison against a constant,
>>>>>> which is hard to beat.
>>>>>>
>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>> 2. Those comparisons need to load 32 bit values from machine code, 
>>>>> against
>>>>> only 8 bit values in my case.
>>>>
>>>> It's a good point.  In the machine code, shifts are likely to use
>>>> immediate values, and so will be a small win.
>>>>
>>>> int x = codePoint >>> 16;
>>>> return x != 0 && x < 0x11;
>>>>
>>>> (On modern hardware, these optimizations
>>>> are less valuable than they used to be;
>>>> ordinary integer arithmetic is almost free)
>>>>
>>>
>>> I'm not convinced if the proposed code is really better...a "small 
>>> win".
>>>
>>> Without seeing the real native machine code generated, I'm not sure
>>> if
>>>
>>>       0: iload_0
>>>       1: bipush        16
>>>       3: iushr
>>>       4: istore_1
>>>       5: iload_1
>>>       6: ifeq          19
>>>
>>> is really better than
>>>
>>>       0: iload_0
>>>       1: ldc           #2                  // int 65536
>>>       3: if_icmplt     16
>>>
>>>
>>> for bmp character case, especially given the existing code has 
>>> better readability and yes, shorter....
>>>
>>> Yes, shift might be able to use the immediate values, but it still 
>>> needs to handle the "operands"
>>> and it is an extra operation. The only chance the new one might be 
>>> better is that the "ifeq" is
>>> faster than "if_icmplt", but have not worked on the instruction set 
>>> level for too long, so I can't
>>> tell (kinda remember you have to check the "circles" of each 
>>> operation to see which one is
>>> "faster" during my old gcc compiler day)
>>>
>>> OK, convince me:-)
>>>
>>> -Sherman
>>>
>>>
>>> public class Character extends java.lang.Object {
>>>   public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;
>>>
>>>   public static final int MAX_CODE_POINT = 1114111;
>>>
>>>   public Character();
>>>     Code:
>>>        0: aload_0
>>>        1: invokespecial #1                  // Method java/lang/Object."<init>":()V
>>>        4: return
>>>
>>>   public static boolean isSupplementaryCodePoint(int);
>>>     Code:
>>>        0: iload_0
>>>        1: ldc           #2                  // int 65536
>>>        3: if_icmplt     16
>>>        6: iload_0
>>>        7: ldc           #3                  // int 1114111
>>>        9: if_icmpgt     16
>>>       12: iconst_1
>>>       13: goto          17
>>>       16: iconst_0
>>>       17: ireturn
>>>
>>>   public static boolean isSupplementaryCodePoint_new(int);
>>>     Code:
>>>        0: iload_0
>>>        1: bipush        16
>>>        3: iushr
>>>        4: istore_1
>>>        5: iload_1
>>>        6: ifeq          19
>>>        9: iload_1
>>>       10: bipush        17
>>>       12: if_icmpge     19
>>>       15: iconst_1
>>>       16: goto          20
>>>       19: iconst_0
>>>       20: ireturn
>>> }
>>>
>>>
>
>


From martinrb at google.com  Tue Mar 16 21:09:13 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 16 Mar 2010 14:09:13 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4B9FE4DD.1090405@sun.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
Message-ID: <1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>

On Tue, Mar 16, 2010 at 13:06, Xueming Shen <Xueming.Shen at sun.com> wrote:
> Martin Buchholz wrote:
>>
>> Therefore the existing implementation
>>>>
>>>>  return codePoint>= MIN_SUPPLEMENTARY_CODE_POINT
>>>>            &&  codePoint<= MAX_CODE_POINT;
>>>>
>>>> will almost always perform just one comparison against a constant,
>>>> which is hard to beat.
>>>>
>>>>
>>>
>>> 1. Wondering: I think there are TWO comparisons.
>>> 2. Those comparisons need to load 32 bit values from machine code,
>>> against
>>> only 8 bit values in my case.
>>>
>>
>> It's a good point.  In the machine code, shifts are likely to use
>> immediate values, and so will be a small win.
>>
>> int x = codePoint >>> 16;
>> return x != 0 && x < 0x11;
>>
>> (On modern hardware, these optimizations
>> are less valuable than they used to be;
>> ordinary integer arithmetic is almost free)
>>
>>
>
> I'm not convinced if the proposed code is really better...a "small win".

The primary theory here is that branches are expensive,
and we are reducing them by one.

> Without seeing the real native machine code generated, I'm not sure
> if
>
>       0: iload_0
>       1: bipush        16
>       3: iushr
>       4: istore_1
>       5: iload_1
>       6: ifeq          19
>
> is really better than
>
>       0: iload_0
>       1: ldc           #2                  // int 65536
>       3: if_icmplt     16
>
>
> for bmp character case, especially given the existing code has better
> readability and yes, shorter....

The very latest version of the code is Ulf's, which is readable and optimal
(as long as it is inlined):

    public static boolean isSupplementaryCodePoint(int codePoint) {
         return !isBMPCodePoint(codePoint) && isValidCodePoint(codePoint);
    }
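
To make the composition concrete, here is a self-contained sketch. The bodies of isBMPCodePoint() and isValidCodePoint() are assumptions based on the shift idea discussed in this thread, not copied from the webrev, and the class name is hypothetical.

```java
// Hypothetical sketch: isSupplementaryCodePoint() composed from two shift-based tests.
final class CodePointRangeSketch {
    // Assumed body: true when no bits above the low 16 are set, i.e. U+0000..U+FFFF.
    static boolean isBMPCodePoint(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // Assumed body: true when the plane number (the bits above the low 16) is below 0x11.
    // Negative inputs shift to a large positive plane number and are rejected.
    static boolean isValidCodePoint(int codePoint) {
        return codePoint >>> 16 < ((Character.MAX_CODE_POINT + 1) >>> 16);
    }

    // The composed form quoted above.
    static boolean isSupplementaryCodePoint(int codePoint) {
        return !isBMPCodePoint(codePoint) && isValidCodePoint(codePoint);
    }

    public static void main(String[] args) {
        int[] samples = { -1, 0x41, 0xFFFF, 0x10000, 0x10FFFF, 0x110000 };
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + " -> " + isSupplementaryCodePoint(cp));
        }
    }
}
```

If both helpers are inlined, the two shifts operate on the same value, so a compiler has the chance to compute the plane number once.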

Martin

> Yes, shift might be able to use the immediate values, but it still needs to
> handle the "operands"
> and it is an extra operation. The only chance the new one might be better is
> that the "ifeq" is
> faster than "if_icmplt", but have not worked on the instruction set level
> for too long, so I can't
> tell (kinda remember you have to check the "circles" of each operation to
> see which one is
> "faster" during my old gcc compiler day)
>
> OK, convince me:-)
>
> -Sherman
>
>
> public class Character extends java.lang.Object {
>   public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;
>
>   public static final int MAX_CODE_POINT = 1114111;
>
>   public Character();
>     Code:
>        0: aload_0
>        1: invokespecial #1                  // Method java/lang/Object."<init>":()V
>        4: return
>
>   public static boolean isSupplementaryCodePoint(int);
>     Code:
>        0: iload_0
>        1: ldc           #2                  // int 65536
>        3: if_icmplt     16
>        6: iload_0
>        7: ldc           #3                  // int 1114111
>        9: if_icmpgt     16
>       12: iconst_1
>       13: goto          17
>       16: iconst_0
>       17: ireturn
>
>   public static boolean isSupplementaryCodePoint_new(int);
>     Code:
>        0: iload_0
>        1: bipush        16
>        3: iushr
>        4: istore_1
>        5: iload_1
>        6: ifeq          19
>        9: iload_1
>       10: bipush        17
>       12: if_icmpge     19
>       15: iconst_1
>       16: goto          20
>       19: iconst_0
>       20: ireturn
> }
>
>


From Ulf.Zibis at gmx.de  Tue Mar 16 20:27:05 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 16 Mar 2010 21:27:05 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003121529r22651bfcnfca6435311d707a6@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995F9C.3070705@sun.com>
	<1ccfd1c11003121529r22651bfcnfca6435311d707a6@mail.gmail.com>
Message-ID: <4B9FE999.3050106@gmx.de>

On 13.03.2010 00:29, Martin Buchholz wrote:
> OK, next round of review.
>
> I changed my UTF-8 changes to be behavior-preserving,
> removing any hint of controversy, and renamed the patch
> to "utf8-twiddling".
>
> I got Ulf in my head, and can't stop micro-optimizing.
> I added a new micro-optimizing patch for Bits.java.
> Please file a bug.
>
> 6934268: Better implementation of Character.isValidCodePoint and
> isSupplementaryCodePoint()
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isSupplementaryCodePoint
>    

Wouldn't you like to add:
      * <p><b>Note:</b> In combination with {@link #isBMPCodePoint(int)} this
      * method should be in 2nd place to permit additional HotSpot compiler
      * optimization. Example:
      * <blockquote><pre>
      *     if (Character.isBMPCodePoint(codePoint))
      *         ...;
      *     else if (Character.isSupplementaryCodePoint(codePoint))
      *         ...;
      *     else
      *         ...;
      * </pre></blockquote>
      *

> 6934265: Add public method Character.isBMPCodePoint
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
>    

Additionally please move static final int SIZE = 16 to one of the first 
lines of the code.
See: https://bugs.openjdk.java.net/attachment.cgi?id=178&action=diff

> 6934270: Remove javac warnings from Character.java
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings
> 6934271: Better handling of longer utf-8 sequences
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/utf8-twiddling
> 6666666: Optimize bit-twiddling in Bits.java
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Bits.java
>    

Hm, I can't see any difference that would merit calling it a 
micro-optimization. Am I blind?

> Now I need to go off to my micro-optimizers-anonymous meeting.
>    

Oh, you are coming to Cologne, Germany. It will be nice to meet you in person.

-Ulf


From Xueming.Shen at Sun.COM  Tue Mar 16 22:35:16 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Tue, 16 Mar 2010 14:35:16 -0800
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
Message-ID: <4BA007A4.2030907@sun.com>

Martin Buchholz wrote:
> On Tue, Mar 16, 2010 at 13:06, Xueming Shen <Xueming.Shen at sun.com> wrote:
>   
>> Martin Buchholz wrote:
>>     
>>> Therefore the existing implementation
>>>       
>>>>>  return codePoint>= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>            &&  codePoint<= MAX_CODE_POINT;
>>>>>
>>>>> will almost always perform just one comparison against a constant,
>>>>> which is hard to beat.
>>>>>
>>>>>
>>>>>           
>>>> 1. Wondering: I think there are TWO comparisons.
>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>> against
>>>> only 8 bit values in my case.
>>>>
>>>>         
>>> It's a good point.  In the machine code, shifts are likely to use
>>> immediate values, and so will be a small win.
>>>
>>> int x = codePoint >>> 16;
>>> return x != 0 && x < 0x11;
>>>
>>> (On modern hardware, these optimizations
>>> are less valuable than they used to be;
>>> ordinary integer arithmetic is almost free)
>>>
>>>
>>>       
>> I'm not convinced if the proposed code is really better...a "small win".
>>     
>
> The primary theory here is that branches are expensive,
> and we are reducing them by one.
>
>   

There are still two branches in the new impl, if you count the "ifeq" and 
"if_icmpge"(?)

We are trying to "optimize" this piece of code on the assumption that 
the new impl MIGHT help a certain VM (HotSpot?)
optimize a certain use scenario (some consecutive usages), if the 
compiler and/or the VM are both smart enough at some
point, with no supporting benchmark data?

My concern is that, in reality, this optimization might 
even hurt the BMP use
case (the majority of real-world use scenarios), with a 10% 
bigger bytecode size.

-Sherman


public class Character extends java.lang.Object {
  public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;

  public static final int MAX_CODE_POINT = 1114111;

  public Character();
    Code:
       0: aload_0       
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return        

  public static boolean isSupplementaryCodePoint(int);
    Code:
       0: iload_0       
       1: ldc           #2                  // int 65536
       3: if_icmplt     16
       6: iload_0       
       7: ldc           #3                  // int 1114111
       9: if_icmpgt     16
      12: iconst_1      
      13: goto          17
      16: iconst_0      
      17: ireturn       

  public static boolean isSupplementaryCodePoint_new(int);
    Code:
       0: iload_0       
       1: bipush        16
       3: iushr         
       4: istore_1
       5: iload_1       
       6: ifeq          19
       9: iload_1       
      10: bipush        17
      12: if_icmpge     19
      15: iconst_1      
      16: goto          20
      19: iconst_0      
      20: ireturn       
}
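
For reference, the two listings above would presumably come from source along the following lines. This is a reconstruction under assumptions: the class name, the renamed second method, and the crude timing loop are all hypothetical. The timing is only a rough sketch and no substitute for a proper benchmark harness, because JIT warm-up and inlining can easily dominate the numbers.

```java
import java.util.function.IntPredicate;

// Hypothetical reconstruction of the two variants compared in the javap listing.
final class SupplementaryVariants {
    static final int MIN_SUPPLEMENTARY_CODE_POINT = 0x010000; // 65536
    static final int MAX_CODE_POINT = 0x10FFFF;               // 1114111

    // Range-check form; corresponds to the ldc / if_icmplt / if_icmpgt listing.
    static boolean isSupplementaryCodePoint(int codePoint) {
        return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
            && codePoint <= MAX_CODE_POINT;
    }

    // Shift form; corresponds to the bipush / iushr / ifeq / if_icmpge listing.
    static boolean isSupplementaryCodePointNew(int codePoint) {
        int x = codePoint >>> 16;
        return x != 0 && x < 0x11;
    }

    // Counts the values in [from, to) accepted by the given test.
    static int count(IntPredicate test, int from, int to) {
        int hits = 0;
        for (int cp = from; cp < to; cp++) {
            if (test.test(cp)) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int from = -0x20000, to = 0x130000;
        long t0 = System.nanoTime();
        int a = count(SupplementaryVariants::isSupplementaryCodePoint, from, to);
        long t1 = System.nanoTime();
        int b = count(SupplementaryVariants::isSupplementaryCodePointNew, from, to);
        long t2 = System.nanoTime();
        System.out.println("range form: " + a + " hits, " + (t1 - t0) + " ns");
        System.out.println("shift form: " + b + " hits, " + (t2 - t1) + " ns");
    }
}
```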


From martinrb at google.com  Tue Mar 16 21:36:18 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 16 Mar 2010 14:36:18 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4B9FF0F9.4090606@gmx.de>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com> <4B9FE5A0.7010409@gmx.de>
	<4B9FF887.6080402@sun.com> <4B9FF0F9.4090606@gmx.de>
Message-ID: <1ccfd1c11003161436o191295f2r9464d715488cb16d@mail.gmail.com>

On Tue, Mar 16, 2010 at 13:58, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:

> Additionally, toUpperCaseCharArray(), codePointCountImpl(), String(int[],
> int, int) would profit from consecutive use of isBMPCodePoint +
> isSupplementaryCodePoint() or isHighSurrogate() + isLowSurrogate.

For codePointCountImpl(), I do not agree.

For String(int[], int, int), I do agree.

Here is my latest more readable and more performant implementation:

        int end = offset + count;

        // Pass 1: Compute precise size of char[]
        int n = 0;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBMPCodePoint(c))
                n += 1;
            else if (Character.isSupplementaryCodePoint(c))
                n += 2;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
        char[] v = new char[n];
        for (int i = offset, j = 0; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBMPCodePoint(c)) {
                v[j++] = (char) c;
            } else {
                Character.toSurrogates(c, v, j);
                j += 2;
            }
        }
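
The fragment above depends on the surrounding String constructor and on
the package-private Character.toSurrogates. A self-contained sketch of
the same two-pass idea, assuming Java 7 or later for
Character.isBMPCodePoint, might look like this. The class name
CodePointsToChars is hypothetical, and the public Character.toChars
stands in for toSurrogates.

```java
final class CodePointsToChars {
    // Converts codePoints[offset .. offset+count) to UTF-16 chars.
    static char[] toChars(int[] codePoints, int offset, int count) {
        int end = offset + count;

        // Pass 1: compute the precise size of the char[]
        int n = 0;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBMPCodePoint(c))
                n += 1;
            else if (Character.isSupplementaryCodePoint(c))
                n += 2;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: allocate and fill in the char[]
        char[] v = new char[n];
        for (int i = offset, j = 0; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBMPCodePoint(c)) {
                v[j++] = (char) c;
            } else {
                // Stand-in for Character.toSurrogates(c, v, j);
                // toChars returns the number of chars written (2 here).
                j += Character.toChars(c, v, j);
            }
        }
        return v;
    }

    public static void main(String[] args) {
        int[] cps = { 0x41, 0x1F600, 0x20AC };
        System.out.println(toChars(cps, 0, cps.length).length);
    }
}
```

Pass 2 needs no validity check of its own, since pass 1 has already
rejected anything that is neither a BMP nor a supplementary code point.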

Martin


From Ulf.Zibis at gmx.de  Tue Mar 16 21:51:55 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 16 Mar 2010 22:51:55 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003161357p50ab32delb260dd8f16651915@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8EB46C.1010208@sun.com>	
	<4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995D22.2020507@gmx.de>	
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>	
	<4B9FE077.3060608@gmx.de>
	<1ccfd1c11003161357p50ab32delb260dd8f16651915@mail.gmail.com>
Message-ID: <4B9FFD7B.2070705@gmx.de>

On 16.03.2010 21:57, Martin Buchholz wrote:
> On Tue, Mar 16, 2010 at 12:48, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>>>> - Same shift magic would enhance isISOControl(), isHighSurrogate(),
>>>> isLowSurrogate(), in particular if latter occur consecutive.
>>>>
>>>>          
>>> isISOControl - yes, others - I am not convinced.
>>>
>>>        
>> If virtually shifted by 8, HotSpot could use cheaper 1-byte compare on the
>> high byte.
>> Additionally, those methods are often used consecutively, so all 4 compares
>> would benefit.
>>
>>      
>>>>   8-bit shift + compare would allow HotSpot to compile to smart 1-byte
>>>> immediate op-codes.
>>>> In encodeBufferLoop() you could use putChar(), putInt() instead put().
>>>> Should perform better.
>>>>
>>>>          
>>> I'm not convinced.  You would need to assemble bytes into an
>>> int, and then break them apart into bytes on the other side?
>>>
>>>        
>> Some time ago, I disassembled such code. I could see, that the int was
>> copied directly to memory by one 32-bit move instruction.
>> In case of using put(byte), I saw 4 8-bit move instructions.
>>      
> Ulf, I'd like to understand this better.
>
> How are you generating the machine code
> (pointer to docs?)?
>    

I must prepare it. Takes some time.

> Bits.java is doing byte-oriented put instructions in any case.
> If the VM can optimize putInt, it should be able to optimize
> the equivalent series of put(byte) as well, no?
>    

Yes, it should, but it doesn't.

> Can you provide a small patch that gives an observable
> performance improvement in a micro-benchmark?
>    

I'll try.

>    
>> I have not disassembled whether a 3-byte value would first be collected in a
>> 3-byte byte[] and then copied by put(byte[]). Maybe HotSpot could optimize
>> here too.
>>
>> Try it out. Two people will see more than one. Maybe I was in error.
>>
>> BTW: for the same optimization, I would like to have putInt() and putLong()
>> in CharBuffer, ShortBuffer and for the latter in IntBuffer.
>>      
> Perhaps better to get the VM to optimize a series of puts into
> a single instruction?
>    

I have such an RFE in mind.
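
To make the two forms under discussion concrete, here is a small sketch
(the class and method names are hypothetical). It only shows that one
putInt and four put(byte) calls leave the same bytes in a big-endian
ByteBuffer. It says nothing about which form the JIT compiles better,
which is exactly the open question above.

```java
final class PutIntVsPutBytes {
    // One 32-bit write; a freshly allocated ByteBuffer is big-endian.
    static byte[] viaPutInt(int x) {
        java.nio.ByteBuffer bb = java.nio.ByteBuffer.allocate(4);
        bb.putInt(x);
        return bb.array();
    }

    // Four 8-bit writes, most significant byte first.
    static byte[] viaPutBytes(int x) {
        java.nio.ByteBuffer bb = java.nio.ByteBuffer.allocate(4);
        bb.put((byte) (x >> 24));
        bb.put((byte) (x >> 16));
        bb.put((byte) (x >> 8));
        bb.put((byte) x);
        return bb.array();
    }

    public static void main(String[] args) {
        int x = 0x01020304;
        System.out.println(java.util.Arrays.equals(viaPutInt(x), viaPutBytes(x)));
    }
}
```

A micro-benchmark comparing the two would still be needed to support
any performance claim.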

-Ulf


From martinrb at google.com  Tue Mar 16 22:00:34 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 16 Mar 2010 15:00:34 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA007A4.2030907@sun.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com>
Message-ID: <1ccfd1c11003161500o3d41e3felb2ab619f27095082@mail.gmail.com>

I am recanting my previous support for any change to
isSupplementaryCodePoint.

I think my brain (or maybe Ulf's brain)
tricked me into thinking that
the considerations for isValidCodePoint and
isBMPCodePoint also apply to
isSupplementaryCodePoint.

Sorry.

I renamed my patch file from isSupplementaryCodePoint to isValidCodePoint.

6934268: Better implementation of Character.isValidCodePoint
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isValidCodePoint
6934265: Add public method Character.isBMPCodePoint
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
6934270: Remove javac warnings from Character.java
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings
6934271: Better handling of longer utf-8 sequences
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/utf8-twiddling
6935172: Optimize bit-twiddling in Bits.java
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Bits.java
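
Why the shift helps isValidCodePoint and isBMPCodePoint but not
isSupplementaryCodePoint can be sketched as follows. This is an
illustration under assumptions, not a copy of the webrevs listed above;
the class PlaneChecks and its method names are made up. The first two
ranges start at 0, so after an unsigned shift a single comparison is
enough, whereas the supplementary range has a nonzero lower bound and
needs two comparisons either way.

```java
final class PlaneChecks {
    static final int MAX_CODE_POINT = 0x10FFFF;

    // Range 0x0000..0xFFFF: the plane number must be 0.
    static boolean isBmpCodePoint(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // Range 0x0000..0x10FFFF: the plane number must be below 0x11.
    // Negative inputs shift to 0x8000 or more and are rejected.
    static boolean isValidCodePoint(int codePoint) {
        int plane = codePoint >>> 16;
        return plane < ((MAX_CODE_POINT + 1) >>> 16);
    }

    public static void main(String[] args) {
        System.out.println(isBmpCodePoint(0xFFFF) + " " + isValidCodePoint(0x110000));
    }
}
```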

Martin

On Tue, Mar 16, 2010 at 15:35, Xueming Shen <Xueming.Shen at sun.com> wrote:
> Martin Buchholz wrote:
>>
>> On Tue, Mar 16, 2010 at 13:06, Xueming Shen <Xueming.Shen at sun.com> wrote:
>>
>>>
>>> Martin Buchholz wrote:
>>>
>>>>
>>>> Therefore the existing implementation
>>>>
>>>>>>
>>>>>>  return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>>      && codePoint <= MAX_CODE_POINT;
>>>>>>
>>>>>> will almost always perform just one comparison against a constant,
>>>>>> which is hard to beat.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>>> against
>>>>> only 8 bit values in my case.
>>>>>
>>>>>
>>>>
>>>> It's a good point.  In the machine code, shifts are likely to use
>>>> immediate values, and so will be a small win.
>>>>
>>>> int x = codePoint >>> 16;
>>>> return x != 0 && x < 0x11;
>>>>
>>>> (On modern hardware, these optimizations
>>>> are less valuable than they used to be;
>>>> ordinary integer arithmetic is almost free)
>>>>
>>>>
>>>>
>>>
>>> I'm not convinced if the proposed code is really better...a "small win".
>>>
>>
>> The primary theory here is that branches are expensive,
>> and we are reducing them by one.
>>
>>
>
> There are still two branches in new impl, if you count the "ifeq" and
> "if_icmpge"(?)
>
> We are trying to "optimize" this piece of code with the assumption that the
> new impl MIGHT help certain vm (hotspot?)
> to optimize certain use scenario (some consecutive usages), if the compiler
> and/or the vm are both smart enough at certain
> point, with no supporting benchmark data?
>
> My concern is that the reality might be that this optimization might even
> hurt the BMP use
> case (the majority of the possible real world use scenarios) with a 10%
> bigger bytecode size.
>
> -Sherman
>
>
>
> public class Character extends java.lang.Object {
>   public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;
>
>   public static final int MAX_CODE_POINT = 1114111;
>
>   public Character();
>     Code:
>        0: aload_0
>        1: invokespecial #1                  // Method java/lang/Object."<init>":()V
>        4: return
>
>   public static boolean isSupplementaryCodePoint(int);
>     Code:
>        0: iload_0
>        1: ldc           #2                  // int 65536
>        3: if_icmplt     16
>        6: iload_0
>        7: ldc           #3                  // int 1114111
>        9: if_icmpgt     16
>       12: iconst_1
>       13: goto          17
>       16: iconst_0
>       17: ireturn
>
>   public static boolean isSupplementaryCodePoint_new(int);
>     Code:
>        0: iload_0
>        1: bipush        16
>        3: iushr
>        4: istore_1
>        5: iload_1
>        6: ifeq          19
>        9: iload_1
>       10: bipush        17
>       12: if_icmpge     19
>       15: iconst_1
>       16: goto          20
>       19: iconst_0
>       20: ireturn
> }
>
>
>
>
>
>


From Ulf.Zibis at gmx.de  Tue Mar 16 22:25:40 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 16 Mar 2010 23:25:40 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4BA007A4.2030907@sun.com>
References: <4A95079A.8080803@gmx.de>
	<4A9578C4.8060801@sun.com>	<4B8DA070.3040306@gmx.de>	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	<4B8E3DA3.7090902@gmx.de>	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	<4B9FE4DD.1090405@sun.com>	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com>
Message-ID: <4BA00564.4010104@gmx.de>

On 16.03.2010 23:35, Xueming Shen wrote:
> Martin Buchholz wrote:
>> On Tue, Mar 16, 2010 at 13:06, Xueming Shen <Xueming.Shen at sun.com> 
>> wrote:
>>> Martin Buchholz wrote:
>>>> Therefore the existing implementation
>>>>>>  return codePoint>= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>> &&  codePoint<= MAX_CODE_POINT;
>>>>>>
>>>>>> will almost always perform just one comparison against a constant,
>>>>>> which is hard to beat.
>>>>>>
>>>>>>
>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>>> against
>>>>> only 8 bit values in my case.
>>>>>
>>>> It's a good point.  In the machine code, shifts are likely to use
>>>> immediate values, and so will be a small win.
>>>>
>>>> int x = codePoint >>> 16;
>>>> return x != 0 && x < 0x11;
>>>>
>>>> (On modern hardware, these optimizations
>>>> are less valuable than they used to be;
>>>> ordinary integer arithmetic is almost free)
>>>>
>>>>
>>> I'm not convinced if the proposed code is really better...a "small 
>>> win".
>>
>> The primary theory here is that branches are expensive,
>> and we are reducing them by one.
>>
>
> There are still two branches in new impl, if you count the "ifeq" and 
> "if_icmpge"(?)

True.

But
         for (int i = offset; i < offset + count; i++) {
             int c = codePoints[i];
             byte plane = (byte)(c >>> 16);
             if (plane == 0)
                 n += 1;
             else if (plane >= 0 && plane <= (byte)0x11)
                 n += 2;
             else throw new IllegalArgumentException(Integer.toString(c));
         }
also has only 2 branches if 6932837 were fixed (3 otherwise), and
could additionally benefit from tiny 8-bit comparisons.
The shift could also be omitted on CPUs that can benefit from
6933327.
By contrast:
         for (int i = offset; i < offset + count; i++) {
             int c = codePoints[i];
             if (c >= Character.MIN_VALUE &&
                 c <=  Character.MAX_VALUE)
                 n += 1;
             else if (c >= Character.MIN_SUPPLEMENTARY_CODE_POINT &&
                 c <=  Character.MAX_SUPPLEMENTARY_CODE_POINT)
                 n += 2;
             else throw new IllegalArgumentException(Integer.toString(c));
         }
needs 4 branches and 4 32-bit comparisons.


>
> We are trying to "optimize" this piece of code with the assumption 
> that the new impl MIGHT help certain vm (hotspot?)
> to optimize certain use scenario (some consecutive usages), if the 
> compiler and/or the vm are both smart enough at certain
> point, with no supporting benchmark data?
>
> My concern is that the reality might be that this optimization might 
> even hurt the BMP use
> case (the majority of the possible real world use scenarios) with a 
> 10% bigger bytecode size.
>
> -Sherman
>
>
>
> public class Character extends java.lang.Object {
>  public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;
>
>  public static final int MAX_CODE_POINT = 1114111;
>
>  public Character();
>    Code:
>       0: aload_0             1: invokespecial #1                  // 
> Method java/lang/Object."<init>":()V
>       4: return
>  public static boolean isSupplementaryCodePoint(int);
>    Code:
>       0: iload_0             1: ldc           #2                  // 
> int 65536
>       3: if_icmplt     16
>       6: iload_0             7: ldc           #3                  // 
> int 1114111
>       9: if_icmpgt     16
>      12: iconst_1           13: goto          17
>      16: iconst_0           17: ireturn
>  public static boolean isSupplementaryCodePoint_new(int);
>    Code:
>       0: iload_0             1: bipush        16
>       3: iushr               4: istore_1
>       5: iload_1             6: ifeq          19
>       9: iload_1            10: bipush        17
>      12: if_icmpge     19
>      15: iconst_1           16: goto          20
>      19: iconst_0           20: ireturn       }
>
>
>
>
>
>


From Ulf.Zibis at gmx.de  Tue Mar 16 23:14:08 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 00:14:08 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003161436o191295f2r9464d715488cb16d@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>	
	<4B8DA070.3040306@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com> <4B9FE5A0.7010409@gmx.de>	
	<4B9FF887.6080402@sun.com> <4B9FF0F9.4090606@gmx.de>
	<1ccfd1c11003161436o191295f2r9464d715488cb16d@mail.gmail.com>
Message-ID: <4BA010C0.6020204@gmx.de>

On 16.03.2010 22:36, Martin Buchholz wrote:
> On Tue, Mar 16, 2010 at 13:58, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>
>    
>> Additionally, toUpperCaseCharArray(), codePointCountImpl(), String(int[],
>> int, int) would profit from consecutive use of isBMPCodePoint +
>> isSupplementaryCodePoint() or isHighSurrogate() + isLowSurrogate.
>>      
> For codePointCountImpl(), I do not agree.
>    

1-byte comparisons have a smaller footprint, probably load faster from
memory, need less L1 CPU cache, would be faster on small/RISC/etc. CPUs,
and should therefore improve overall performance.
The shift could additionally be omitted on CPUs that can benefit from
6933327.

> For String(int[], int, int), I do agree.
>
> Here is my latest more readable and more performant implementation:
>
>          int end = offset + count;
>
>          // Pass 1: Compute precise size of char[]
>          int n = 0;
>          for (int i = offset; i<  end; i++) {
>              int c = codePoints[i];
>              if (Character.isBMPCodePoint(c))
>                  n += 1;
>              else if (Character.isSupplementaryCodePoint(c))
>                  n += 2;
>              else throw new IllegalArgumentException(Integer.toString(c));
>          }
>
>          // Pass 2: Allocate and fill in char[]
>          char[] v = new char[n];
>          for (int i = offset, j = 0; i<  end; i++) {
>              int c = codePoints[i];
>              if (Character.isBMPCodePoint(c)) {
>                  v[j++] = (char) c;
>              } else {
>                  Character.toSurrogates(c, v, j);
>                  j += 2;
>              }
>          }
>    

I suggest:

         // Pass 2: Allocate and fill in char[]
         char[] v = new char[n];
         for (int i = end; n > 0; ) {
             int c = codePoints[--i];
             if (Character.isBMPCodePoint(c))
                 v[--n] = (char)c;
             else
                 Character.toSurrogates(c, v, n -= 2);
         }

- saves 1 variable (= reduces register pressure)
- testing the loop end against 0 is faster than testing against "end",
see: 6932855 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6932855>
BTW:
     int end = offset + count;
could be dropped, as the VM would hoist that computation itself, certainly
in the HotSpot C2 compiler.
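
A runnable sketch of the backward-filling pass 2, for anyone who wants
to try it: the class name BackwardFill is hypothetical, and the public
Character.toChars stands in for the package-private
Character.toSurrogates. It assumes Java 7 or later for
Character.isBMPCodePoint, and makes no claim about speed.

```java
final class BackwardFill {
    static char[] toChars(int[] codePoints, int offset, int count) {
        int end = offset + count;

        // Pass 1: compute the precise size of the char[]
        int n = 0;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBMPCodePoint(c))
                n += 1;
            else if (Character.isSupplementaryCodePoint(c))
                n += 2;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: fill from the back; n doubles as the write index and
        // reaches 0 exactly when i reaches offset.
        char[] v = new char[n];
        for (int i = end; n > 0; ) {
            int c = codePoints[--i];
            if (Character.isBMPCodePoint(c))
                v[--n] = (char) c;
            else
                Character.toChars(c, v, n -= 2); // writes v[n], v[n+1]
        }
        return v;
    }

    public static void main(String[] args) {
        int[] cps = { 0x41, 0x1F600, 0x20AC };
        System.out.println(new String(toChars(cps, 0, cps.length)));
    }
}
```

The result is the same as with the forward loop; only the loop
condition and the number of live variables differ.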

-Ulf


From martinrb at google.com  Tue Mar 16 23:41:08 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 16 Mar 2010 16:41:08 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA010C0.6020204@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com> <4B9FE5A0.7010409@gmx.de>
	<4B9FF887.6080402@sun.com> <4B9FF0F9.4090606@gmx.de>
	<1ccfd1c11003161436o191295f2r9464d715488cb16d@mail.gmail.com>
	<4BA010C0.6020204@gmx.de>
Message-ID: <1ccfd1c11003161641o202711aerad4e686d21a2cf53@mail.gmail.com>

On Tue, Mar 16, 2010 at 16:14, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 16.03.2010 22:36, Martin Buchholz wrote:
>
> On Tue, Mar 16, 2010 at 13:58, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
>
>
> Additionally, toUpperCaseCharArray(), codePointCountImpl(), String(int[],
> int, int) would profit from consecutive use of isBMPCodePoint +
> isSupplementaryCodePoint() or isHighSurrogate() + isLowSurrogate.
>
>
> For codePointCountImpl(), I do not agree.
>
>
> 1-byte comparisons have less footprint, in doubt load faster from memory,
> need less L1-CPU-cache, on small/RISC/etc. CPU's would be faster and
> therefore should enhance overall performance.
> The shift additionally could be omitted on CPU's which can benefit from
> 6933327.

I am not convinced.  Using byte for local variables is unlikely to
give any performance benefit.  The only way use of byte can be
a win is if you read/write a bunch of them at once from memory.
I think of byte as a compression scheme for int.

> For String(int[], int, int), I do agree.
>
> Here is my latest more readable and more performant implementation:
>
>         int end = offset + count;
>
>         // Pass 1: Compute precise size of char[]
>         int n = 0;
>         for (int i = offset; i < end; i++) {
>             int c = codePoints[i];
>             if (Character.isBMPCodePoint(c))
>                 n += 1;
>             else if (Character.isSupplementaryCodePoint(c))
>                 n += 2;
>             else throw new IllegalArgumentException(Integer.toString(c));
>         }
>
>         // Pass 2: Allocate and fill in char[]
>         char[] v = new char[n];
>         for (int i = offset, j = 0; i < end; i++) {
>             int c = codePoints[i];
>             if (Character.isBMPCodePoint(c)) {
>                 v[j++] = (char) c;
>             } else {
>                 Character.toSurrogates(c, v, j);
>                 j += 2;
>             }
>         }
>
>
> I suggest:
>
>         // Pass 2: Allocate and fill in char[]
>         char[] v = new char[n];
>         for (int i = end; n > 0; ) {
>             int c = codePoints[--i];
>             if (Character.isBMPCodePoint(c))
>                 v[--n] = (char)c;
>             else
>                 Character.toSurrogates(c, v, n -= 2);
>         }
>
> - saves 1 variable (=reduces register pressure)
> - determining of the loop end against 0 is faster than against "end", see:
> 6932855

Perhaps, but this exceeds my micro-optimization threshold.

> BTW:
>     int end = offset + count;
> could be saved, as VM would do that, for sure in HotSpot c2 compiler.
>
> -Ulf
>
>

Martin


From Ulf.Zibis at gmx.de  Wed Mar 17 00:46:54 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 01:46:54 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003161641o202711aerad4e686d21a2cf53@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com> <4B9FE5A0.7010409@gmx.de>	
	<4B9FF887.6080402@sun.com> <4B9FF0F9.4090606@gmx.de>	
	<1ccfd1c11003161436o191295f2r9464d715488cb16d@mail.gmail.com>	
	<4BA010C0.6020204@gmx.de>
	<1ccfd1c11003161641o202711aerad4e686d21a2cf53@mail.gmail.com>
Message-ID: <4BA0267E.7080009@gmx.de>

On 17.03.2010 00:41, Martin Buchholz wrote:
> On Tue, Mar 16, 2010 at 16:14, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> On 16.03.2010 22:36, Martin Buchholz wrote:
>>
>> On Tue, Mar 16, 2010 at 13:58, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>
>>
>>
>> Additionally, toUpperCaseCharArray(), codePointCountImpl(), String(int[],
>> int, int) would profit from consecutive use of isBMPCodePoint +
>> isSupplementaryCodePoint() or isHighSurrogate() + isLowSurrogate.
>>
>>
>> For codePointCountImpl(), I do not agree.
>>
>>
>> 1-byte comparisons have less footprint, in doubt load faster from memory,
>> need less L1-CPU-cache, on small/RISC/etc. CPU's would be faster and
>> therefore should enhance overall performance.
>> The shift additionally could be omitted on CPU's which can benefit from
>> 6933327.
>>      

1) I agree, this is academic.
2) This should rather be optimized by the VM, but currently isn't; see:
Just filed, no ID yet: - Transform comparisons against odd border to
even border
(Review ID: 1735166) - Use as less bits as necessary
3) Didn't you say we should write code without relying on VM-vendor-specific
optimizations?

4) Regardless of the 8-bit/32-bit arguments, if we subtract 0xd800/0xdc00,
I guess we could benefit from 6932837
<http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6932837> - Better
use unsigned jump if one of the range limits is 0:
         for (int i = offset; i < endIndex; ) {
             n++;
             byte highByte = (byte)((a[i++] >>> 8) - 0xd8);
             if (highByte >= 0 && highByte < 0x4) {
                 if (i < endIndex && (highByte = (byte)((a[i] >>> 8) - 0xdc)) >= 0 && highByte < 0x4) {
                     i++;
                 }
             }
         }
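
For anyone who wants to check the fragment above, here is a
self-contained sketch wrapped in a method. The class name
SurrogateCount and the method signature are hypothetical; the loop body
is the one from the fragment. After subtracting 0xd8 (or 0xdc) from the
high byte and narrowing to byte, the values 0..3 identify a high (or
low) surrogate, so a valid pair is counted once.

```java
final class SurrogateCount {
    // Counts code points in a[offset .. endIndex), treating each valid
    // surrogate pair as one and each unpaired surrogate as one.
    static int codePointCount(char[] a, int offset, int endIndex) {
        int n = 0;
        for (int i = offset; i < endIndex; ) {
            n++;
            // High byte rebased so that 0xD8..0xDB becomes 0..3.
            byte highByte = (byte) ((a[i++] >>> 8) - 0xd8);
            if (highByte >= 0 && highByte < 0x4) {
                // Next char: high byte rebased so 0xDC..0xDF becomes 0..3.
                if (i < endIndex
                        && (highByte = (byte) ((a[i] >>> 8) - 0xdc)) >= 0
                        && highByte < 0x4) {
                    i++;
                }
            }
        }
        return n;
    }

    public static void main(String[] args) {
        char[] s = { 'A', 0xD83D, 0xDE00, 'B' };
        System.out.println(codePointCount(s, 0, s.length));
    }
}
```

The narrowing to byte is safe here only because the high byte of a char
is at most 0xFF, so the difference stays within a range where no
unrelated value wraps around into 0..3.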


> I am not convinced.  Using byte for local variables is unlikely to
> give any performance benefit.  The only way use of byte can be
> a win is if you read/write a bunch of them at once from memory.
> I think of byte as a compression scheme for int.
>
>    
>> For String(int[], int, int), I do agree.
>>
>> Here is my latest more readable and more performant implementation:
>>
>>          int end = offset + count;
>>
>>          // Pass 1: Compute precise size of char[]
>>          int n = 0;
>>          for (int i = offset; i<  end; i++) {
>>              int c = codePoints[i];
>>              if (Character.isBMPCodePoint(c))
>>                  n += 1;
>>              else if (Character.isSupplementaryCodePoint(c))
>>                  n += 2;
>>              else throw new IllegalArgumentException(Integer.toString(c));
>>          }
>>
>>          // Pass 2: Allocate and fill in char[]
>>          char[] v = new char[n];
>>          for (int i = offset, j = 0; i<  end; i++) {
>>              int c = codePoints[i];
>>              if (Character.isBMPCodePoint(c)) {
>>                  v[j++] = (char) c;
>>              } else {
>>                  Character.toSurrogates(c, v, j);
>>                  j += 2;
>>              }
>>          }
>>
>>
>> I suggest:
>>
>>          // Pass 2: Allocate and fill in char[]
>>          char[] v = new char[n];
>>          for (int i = end; n>  0; ) {
>>              int c = codePoints[--i];
>>              if (Character.isBMPCodePoint(c))
>>                  v[--n] = (char)c;
>>              else
>>                  Character.toSurrogates(c, v, n -= 2);
>>          }
>>
>> - saves 1 variable (=reduces register pressure)
>> - determining of the loop end against 0 is faster than against "end", see:
>> 6932855
>>      
> Perhaps, but this exceeds my micro-optimization threshold.
>    

:-(

-Ulf


From Ulf.Zibis at gmx.de  Wed Mar 17 01:14:36 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 02:14:36 +0100
Subject: request for paired constants in j.l.Character
Message-ID: <4BA02CFC.8010907@gmx.de>

In java.lang.Character we have:
     public static final char MIN_VALUE = '\u0000';
     public static final char MAX_VALUE = '\uFFFF';
     public static final int MIN_CODE_POINT = 0x000000;
     public static final int MAX_CODE_POINT = 0X10FFFF;
     public static final int MIN_SUPPLEMENTARY_CODE_POINT = MAX_VALUE + 1;

Since we have MIN_CODE_POINT, which duplicates MIN_VALUE, IMO we could
additionally have
     public static final int MAX_SUPPLEMENTARY_CODE_POINT = MAX_CODE_POINT;

It would look better, and would meet many users' expectation of finding
those MIN/MAX constants as pairs.

Does anybody agree with me?

-Ulf


From Ulf.Zibis at gmx.de  Wed Mar 17 01:20:35 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 02:20:35 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003161346o3496bc39gfa40583abd6bb8c9@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8EB46C.1010208@sun.com>	
	<4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995F9C.3070705@sun.com>	
	<1ccfd1c11003121529r22651bfcnfca6435311d707a6@mail.gmail.com>	
	<4B9FE999.3050106@gmx.de>
	<1ccfd1c11003161346o3496bc39gfa40583abd6bb8c9@mail.gmail.com>
Message-ID: <4BA02E63.306@gmx.de>

On 16.03.2010 21:46, Martin Buchholz wrote:
> On Tue, Mar 16, 2010 at 13:27, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> On 13.03.2010 00:29, Martin Buchholz wrote:
>>
>> Won't you like to add:
>>      *<p><b>Note:</b>  In combination with {@link #isBMPCodePoint(int)} this
>>      * method should be in 2nd place to permit additional HotSpot compiler
>>      * optimization. Example:
>>      *<blockquote><pre>
>>      *     if (Character.isBMPCodePoint(codePoint))
>>      *         ...;
>>      *     else if (Character.isSupplementaryCodePoint(codePoint))
>>      *         ...;
>>      *     else
>>      *         ...;
>>      *</pre></blockquote>
>>      *
>>      
> No.
>
> This kind of implementation-specific comment is not
> traditionally put in public javadoc (it's considered OK
> in private comments).  Also, we should not inflict our
> dangerous micro-optimization disease on others.
>    

Hm, I believe I've seen similar things in javadoc, but I can agree to
dropping the <blockquote> part.
Would "In combination with {@link #isBMPCodePoint(int)} this method
should be in 2nd place to permit additional VM optimization." be better?

-Ulf


From weijun.wang at sun.com  Wed Mar 17 01:55:37 2010
From: weijun.wang at sun.com (weijun.wang at sun.com)
Date: Wed, 17 Mar 2010 01:55:37 +0000
Subject: hg: jdk7/tl/jdk: 6868865: Test:
	sun/security/tools/jarsigner/oldsig.sh fails under all platforms
Message-ID: <20100317015556.BC14844E6E@hg.openjdk.java.net>

Changeset: 0500f7306cbe
Author:    weijun
Date:      2010-03-17 09:55 +0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/0500f7306cbe

6868865: Test: sun/security/tools/jarsigner/oldsig.sh fails under all platforms
Reviewed-by: wetmore

! test/sun/security/tools/jarsigner/oldsig.sh


From Ulf.Zibis at gmx.de  Wed Mar 17 08:36:08 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 09:36:08 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4BA00564.4010104@gmx.de>
References: <4A95079A.8080803@gmx.de>	<4A9578C4.8060801@sun.com>	<4B8DA070.3040306@gmx.de>	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	<4B8E3DA3.7090902@gmx.de>	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	<4B9FE4DD.1090405@sun.com>	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	<4BA007A4.2030907@sun.com>
	<4BA00564.4010104@gmx.de>
Message-ID: <4BA09478.90809@gmx.de>

Oops, correction:

But
         for (int i = offset; i < offset + count; i++) {
             int c = codePoints[i];
             byte plane = (byte)(c >>> 16);
             if (plane == 0)
                 n += 1;
             else if (plane <= (byte)0x11)
                 n += 2;
             else throw new IllegalArgumentException(Integer.toString(c));
         }
also has only 2 branches and could additionally benefit from tiny 8-bit
comparisons.
The shift could also be omitted on CPUs that can benefit from
6933327.

By contrast:
         for (int i = offset; i < offset + count; i++) {
             int c = codePoints[i];
             if (c >= Character.MIN_VALUE &&
                 c <=  Character.MAX_VALUE)
                 n += 1;
             else if (c >= Character.MIN_SUPPLEMENTARY_CODE_POINT &&
                 c <=  Character.MAX_SUPPLEMENTARY_CODE_POINT)
                 n += 2;
             else throw new IllegalArgumentException(Integer.toString(c));
         }
needs 4 branches and 4 32-bit comparisons.

-Ulf


From Ulf.Zibis at gmx.de  Wed Mar 17 09:11:56 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 10:11:56 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4BA09478.90809@gmx.de>
References: <4A95079A.8080803@gmx.de>	<4A9578C4.8060801@sun.com>	<4B8DA070.3040306@gmx.de>	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	<4B8E3DA3.7090902@gmx.de>	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	<4B9FE4DD.1090405@sun.com>	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	<4BA007A4.2030907@sun.com>	<4BA00564.4010104@gmx.de>
	<4BA09478.90809@gmx.de>
Message-ID: <4BA09CDC.1040402@gmx.de>

Am I mad ???

2nd. correction:

But
         for (int i = offset; i < offset + count; i++) {
             int c = codePoints[i];
             char plane = (char)(c >>> 16);
             if (plane == 0)
                 n += 1;
             else if (plane < 0x11)
                 n += 2;
             else throw new IllegalArgumentException(Integer.toString(c));
         }
also has only 2 branches and could additionally benefit from tiny 16-bit
comparisons.
The shift could also be omitted on CPUs that can benefit from
6933327.

By contrast:
         for (int i = offset; i < offset + count; i++) {
             int c = codePoints[i];
             if (c >= Character.MIN_VALUE &&
                 c <=  Character.MAX_VALUE)
                 n += 1;
             else if (c >= Character.MIN_SUPPLEMENTARY_CODE_POINT &&
                 c <=  Character.MAX_SUPPLEMENTARY_CODE_POINT)
                 n += 2;
             else throw new IllegalArgumentException(Integer.toString(c));
         }
needs 4 branches and 4 32-bit comparisons.
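
For reference, the two loops compared above can be put side by side as a minimal, self-contained sketch. The class name CodePointCounting and the method names are hypothetical, chosen only for illustration; the loop bodies follow the snippets in this message, and nothing here measures the branch or comparison cost being discussed.

```java
final class CodePointCounting {

    /** Plane-based variant: classify each code point by bits 16 and up. */
    static int countByPlane(int[] codePoints, int offset, int count) {
        int n = 0;
        for (int i = offset; i < offset + count; i++) {
            int c = codePoints[i];
            char plane = (char) (c >>> 16);   // 0 for BMP, 1..0x10 for supplementary
            if (plane == 0)
                n += 1;
            else if (plane < 0x11)
                n += 2;
            else
                throw new IllegalArgumentException(Integer.toString(c));
        }
        return n;
    }

    /** Range-based variant: compare against the Character constants. */
    static int countByRange(int[] codePoints, int offset, int count) {
        int n = 0;
        for (int i = offset; i < offset + count; i++) {
            int c = codePoints[i];
            if (c >= Character.MIN_VALUE && c <= Character.MAX_VALUE)
                n += 1;
            else if (c >= Character.MIN_SUPPLEMENTARY_CODE_POINT
                    && c <= Character.MAX_CODE_POINT)
                n += 2;
            else
                throw new IllegalArgumentException(Integer.toString(c));
        }
        return n;
    }

    public static void main(String[] args) {
        int[] cps = {0x41, 0xFFFF, 0x10000, 0x10FFFF};
        System.out.println(countByPlane(cps, 0, cps.length));
        System.out.println(countByRange(cps, 0, cps.length));
    }
}
```

The range variant uses Character.MAX_CODE_POINT as its upper bound because MAX_SUPPLEMENTARY_CODE_POINT is only a proposed constant at this point in the thread. Both variants reject negative values and values above 0x10FFFF.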

-Ulf


From martinrb at google.com  Wed Mar 17 15:46:42 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 17 Mar 2010 07:46:42 -0800
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA09CDC.1040402@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA00564.4010104@gmx.de>
	<4BA09478.90809@gmx.de> <4BA09CDC.1040402@gmx.de>
Message-ID: <1ccfd1c11003170846n19c3e273v71daff3a755c58a4@mail.gmail.com>

On Wed, Mar 17, 2010 at 01:11, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am I mad ???
>
> 2nd. correction:
>
> But
>        for (int i = offset; i < offset + count; i++) {
>            int c = codePoints[i];
>            char plane = (char)(c >>> 16);
>            if (plane == 0)
>                n += 1;
>            else if (plane < 0x11)
>                n += 2;
>            else throw new IllegalArgumentException(Integer.toString(c));
>        }
> has too only 2 branches and additionally could benefit from tiny 16-bit
> comparisons.
> The shift additionally could be omitted on CPU's which can benefit from
> 6933327.

I'm not an x86 or HotSpot expert, but I would think that the "plane"
variable is never written to memory, but lives only in a register,
so I see only drawbacks to making plane a "char".

Martin


From Xueming.Shen at Sun.COM  Wed Mar 17 17:05:48 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Wed, 17 Mar 2010 09:05:48 -0800
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003170846n19c3e273v71daff3a755c58a4@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA00564.4010104@gmx.de>
	<4BA09478.90809@gmx.de> <4BA09CDC.1040402@gmx.de>
	<1ccfd1c11003170846n19c3e273v71daff3a755c58a4@mail.gmail.com>
Message-ID: <4BA10BEC.8050105@sun.com>

Martin Buchholz wrote:
> On Wed, Mar 17, 2010 at 01:11, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>   
>> Am I mad ???
>>
>> 2nd. correction:
>>
>> But
>>        for (int i = offset; i < offset + count; i++) {
>>            int c = codePoints[i];
>>            char plane = (char)(c >>> 16);
>>            if (plane == 0)
>>                n += 1;
>>            else if (plane < 0x11)
>>                n += 2;
>>            else throw new IllegalArgumentException(Integer.toString(c));
>>        }
>> has too only 2 branches and additionally could benefit from tiny 16-bit
>> comparisons.
>> The shift additionally could be omitted on CPU's which can benefit from
>> 6933327.
>>     
>
> I'm not a x86 or hotspot expert, but I would think that the "plane"
> variable is never written to memory, but lives only in a register,
> so I see only drawbacks to making plane a "char".
>
>   
I doubt there is any benefit to using an 8-bit or 16-bit operand on a 
32-bit/64-bit machine.
While optimization is definitely good, it might not be a good habit 
to code in a high-level
programming language while thinking in assembly every minute :-)  Let's 
leave those
optimizations to the HotSpot engineers :-)
In this particular case, given that most applications will never use 
supplementary characters, I
doubt it is really worth the optimization, and I would definitely not try to 
change the impl
of isSupplementaryCP to make this code "better". If you really, really 
want to optimize
this code, the alternative is to have a package-private 
Character.getPlane(), or simply
to use your optimized code above. I would suggest using int for plane, 
instead of char or
byte.
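
A minimal sketch of what such a helper might look like, assuming an int-typed plane as suggested. Planes, getPlane and checkedCharCount are hypothetical names used only for illustration; no Character.getPlane() method is claimed to exist in the JDK.

```java
final class Planes {

    /** Returns the plane number of a code point: 0 for the BMP, 1..0x10 for supplementary planes. */
    static int getPlane(int codePoint) {
        return codePoint >>> 16;   // unsigned shift, so negative inputs yield a large plane number
    }

    /** Number of chars needed for a valid code point; rejects anything outside planes 0..0x10. */
    static int checkedCharCount(int codePoint) {
        int plane = getPlane(codePoint);
        if (plane == 0)
            return 1;
        if (plane < 0x11)
            return 2;
        throw new IllegalArgumentException(Integer.toString(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(getPlane(0x1D11E));
        System.out.println(checkedCharCount(0x1D11E));
    }
}
```

Keeping the plane as an int avoids the narrowing cast; whether that makes any measurable difference is left to the JIT, as argued above.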

-Sherman


From Ulf.Zibis at gmx.de  Wed Mar 17 16:29:27 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 17:29:27 +0100
Subject: 2 Questions on StringBuffer
Message-ID: <4BA10367.5060307@gmx.de>

Why are there 2 methods which do not use the super method, although I can't 
see any difference? :

     public synchronized char charAt(int index)
     public synchronized void setCharAt(int index, char ch)

Wouldn't ensureCapacity be better coded as follows? :
     public void ensureCapacity(int minimumCapacity) {
         if (minimumCapacity > value.length) synchronized {
             ensureCapacity(minimumCapacity);
         }
     }
This would save the synchronization if there is nothing to do.

-Ulf


From forax at univ-mlv.fr  Wed Mar 17 16:36:41 2010
From: forax at univ-mlv.fr (Rémi Forax)
Date: Wed, 17 Mar 2010 17:36:41 +0100
Subject: 2 Questions on StringBuffer
In-Reply-To: <4BA10367.5060307@gmx.de>
References: <4BA10367.5060307@gmx.de>
Message-ID: <4BA10519.2090509@univ-mlv.fr>

On 17/03/2010 17:29, Ulf Zibis wrote:
> Why there are 2 methods which do not use the super method, where I 
> can't see any difference? :
>
>     public synchronized char charAt(int index)
>     public synchronized void setCharAt(int index, char ch)
>
> Wouldn't ensureCapacity better coded as follows? :
>     public void ensureCapacity(int minimumCapacity) {
>         if (minimumCapacity > value.length) synchronized {
>             ensureCapacity(minimumCapacity);
>         }
>     }
> This would save the synchronization if there is nothing to do.
>
> -Ulf
>
>
>

No, it doesn't work.
If some variables are not accessed inside the synchronized block,
they can be updated by one thread without the change being visible to 
another thread.

Rémi


From Ulf.Zibis at gmx.de  Wed Mar 17 17:01:08 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 18:01:08 +0100
Subject: 2 Questions on StringBuffer
In-Reply-To: <4BA10519.2090509@univ-mlv.fr>
References: <4BA10367.5060307@gmx.de> <4BA10519.2090509@univ-mlv.fr>
Message-ID: <4BA10AD4.9010602@gmx.de>

On 17.03.2010 17:36, Rémi Forax wrote:
> On 17/03/2010 17:29, Ulf Zibis wrote:
>> Why there are 2 methods which do not use the super method, where I 
>> can't see any difference? :
>>
>>     public synchronized char charAt(int index)
>>     public synchronized void setCharAt(int index, char ch)
>>
>> Wouldn't ensureCapacity better coded as follows? :
>>     public void ensureCapacity(int minimumCapacity) {
>>         if (minimumCapacity > value.length) synchronized {
>>             ensureCapacity(minimumCapacity);
>>         }
>>     }
>> This would save the synchronization if there is nothing to do.
>>
>> -Ulf
>>
>>
>>
>
> no, it doesn't work.
> if some variables are not in the synchronized block,
> they can be updated by one thread but this change will be not visible 
> in another thread.

Hm, those values are checked again in super.ensureCapacity(), i.e. 
inside the synchronized block.

I guess this is the answer to my 2nd question, thanks.
Please excuse the little typo, I meant:
             super.ensureCapacity(minimumCapacity);
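
Spelled out in full, the corrected proposal might look like the sketch below. GrowableChars is a hypothetical stand-in for StringBuffer and its superclass, the growth policy is an assumption, and the unsynchronized pre-check is exactly the racy read that the visibility objection above is about, so this illustrates the idea rather than endorsing it.

```java
import java.util.Arrays;

final class GrowableChars {
    private char[] value = new char[16];

    /** Plays the role of the superclass method: checks and grows while holding the lock. */
    synchronized void ensureCapacityLocked(int minimumCapacity) {
        if (minimumCapacity > value.length) {
            // Assumed growth policy for this sketch only.
            int newCapacity = Math.max(minimumCapacity, value.length * 2 + 2);
            value = Arrays.copyOf(value, newCapacity);
        }
    }

    /** Proposed variant: skip the lock when the unsynchronized check says there is nothing to do. */
    void ensureCapacity(int minimumCapacity) {
        if (minimumCapacity > value.length) {           // racy read, not guarded by the lock
            synchronized (this) {
                ensureCapacityLocked(minimumCapacity);  // re-checks under the lock
            }
        }
    }

    synchronized int capacity() {
        return value.length;
    }
}
```

In this sketch the array only ever grows, so a stale read can at worst cause an unnecessary lock acquisition. Whether the unsynchronized read is acceptable under the Java Memory Model is the open question in this thread.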

-Ulf


From martinrb at google.com  Wed Mar 17 17:41:12 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 17 Mar 2010 09:41:12 -0800
Subject: 2 Questions on StringBuffer
In-Reply-To: <4BA10367.5060307@gmx.de>
References: <4BA10367.5060307@gmx.de>
Message-ID: <1ccfd1c11003171041y5ed161efhd4d4b39716e7e2db@mail.gmail.com>

On Wed, Mar 17, 2010 at 08:29, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Why there are 2 methods which do not use the super method, where I can't see
> any difference? :
>
>    public synchronized char charAt(int index)
>    public synchronized void setCharAt(int index, char ch)

You're correct that these methods
could be refactored to call super ("DRY"),
but the code duplication is small,
and these methods are performance-critical,
so let's just leave them as is.

Martin


From Ulf.Zibis at gmx.de  Wed Mar 17 18:02:13 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 19:02:13 +0100
Subject: 2 Questions on StringBuffer
In-Reply-To: <1ccfd1c11003171041y5ed161efhd4d4b39716e7e2db@mail.gmail.com>
References: <4BA10367.5060307@gmx.de>
	<1ccfd1c11003171041y5ed161efhd4d4b39716e7e2db@mail.gmail.com>
Message-ID: <4BA11925.7080702@gmx.de>

On 17.03.2010 18:41, Martin Buchholz wrote:
> On Wed, Mar 17, 2010 at 08:29, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Why there are 2 methods which do not use the super method, where I can't see
>> any difference? :
>>
>>     public synchronized char charAt(int index)
>>     public synchronized void setCharAt(int index, char ch)
>>      
> You're correct that these methods
> could be refactored to call super ("DRY"),
> but the code duplication is small,
> and these methods are performance-critical,
> so let's just leave them as is.
>    

Additionally, I think there's a bug in the javadoc of those methods:
they actually throw StringIndexOutOfBoundsException.

-Ulf


From martinrb at google.com  Wed Mar 17 19:12:41 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 17 Mar 2010 11:12:41 -0800
Subject: 2 Questions on StringBuffer
In-Reply-To: <4BA11925.7080702@gmx.de>
References: <4BA10367.5060307@gmx.de>
	<1ccfd1c11003171041y5ed161efhd4d4b39716e7e2db@mail.gmail.com>
	<4BA11925.7080702@gmx.de>
Message-ID: <1ccfd1c11003171212r199a0d7bn49afe7f570f767ee@mail.gmail.com>

On Wed, Mar 17, 2010 at 10:02, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 17.03.2010 18:41, Martin Buchholz wrote:
>>
>> On Wed, Mar 17, 2010 at 08:29, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>
>>>
>>> Why there are 2 methods which do not use the super method, where I can't
>>> see
>>> any difference? :
>>>
>>>    public synchronized char charAt(int index)
>>>    public synchronized void setCharAt(int index, char ch)
>>>
>>
>> You're correct that these methods
>> could be refactored to call super ("DRY"),
>> but the code duplication is small,
>> and these methods are performance-critical,
>> so let's just leave them as is.
>>
>
> Additionally I think, there's a bug in javadoc of those methods.
> Actually they throw StringIndexOutOfBoundsException.

Why would that be a bug?

Martin


From martinrb at google.com  Wed Mar 17 21:16:36 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 17 Mar 2010 13:16:36 -0800
Subject: request for paired constants in j.l.Character
In-Reply-To: <4BA02CFC.8010907@gmx.de>
References: <4BA02CFC.8010907@gmx.de>
Message-ID: <1ccfd1c11003171416q32a2820bh3cfb1f0fc93923f0@mail.gmail.com>

On Tue, Mar 16, 2010 at 17:14, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> In java.lang.Character we have:
>    public static final char MIN_VALUE = '\u0000';
>    public static final char MAX_VALUE = '\uFFFF';
>    public static final int MIN_CODE_POINT = 0x000000;
>    public static final int MAX_CODE_POINT = 0X10FFFF;
>    public static final int MIN_SUPPLEMENTARY_CODE_POINT = MAX_VALUE + 1;
>
> As we have MIN_CODE_POINT, which is duplicate of MIN_VALUE, IMO we
> additionally could have
>    public static final int MAX_SUPPLEMENTARY_CODE_POINT = MAX_CODE_POINT;
>
> It would look better and serve plenty users expectations to find those
> MIN/MAX constants as pair.
>
> Is there anybody who agrees with me ?

I agree that the symmetry of MIN/MAX pairs
is a good thing to maintain, and so
your suggestion is slightly better than the status quo,
but ... IMO not better enough to actually
justify making any change to the Java Platform API.

Martin


From Ulf.Zibis at gmx.de  Wed Mar 17 21:24:57 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 17 Mar 2010 22:24:57 +0100
Subject: 2 Questions on StringBuffer
In-Reply-To: <1ccfd1c11003171212r199a0d7bn49afe7f570f767ee@mail.gmail.com>
References: <4BA10367.5060307@gmx.de>	
	<1ccfd1c11003171041y5ed161efhd4d4b39716e7e2db@mail.gmail.com>	
	<4BA11925.7080702@gmx.de>
	<1ccfd1c11003171212r199a0d7bn49afe7f570f767ee@mail.gmail.com>
Message-ID: <4BA148A9.7090605@gmx.de>

On 17.03.2010 20:12, Martin Buchholz wrote:
> On Wed, Mar 17, 2010 at 10:02, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>>
>> Additionally I think, there's a bug in javadoc of those methods.
>> Actually they throw StringIndexOutOfBoundsException.
>>      
> Why would that be a bug?
>    

I think, javadoc should indicate StringIndexOutOfBoundsException here:

     /**
      * @throws IndexOutOfBoundsException {@inheritDoc}
      * @see        #length()
      */
     public synchronized char charAt(int index) {
         if ((index < 0) || (index >= count))
             throw new StringIndexOutOfBoundsException(index);
         return value[index];
     }


Or am I not well enough informed about {@inheritDoc}?

-Ulf


From martinrb at google.com  Wed Mar 17 22:00:31 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 17 Mar 2010 15:00:31 -0700
Subject: 2 Questions on StringBuffer
In-Reply-To: <4BA148A9.7090605@gmx.de>
References: <4BA10367.5060307@gmx.de>
	<1ccfd1c11003171041y5ed161efhd4d4b39716e7e2db@mail.gmail.com>
	<4BA11925.7080702@gmx.de>
	<1ccfd1c11003171212r199a0d7bn49afe7f570f767ee@mail.gmail.com>
	<4BA148A9.7090605@gmx.de>
Message-ID: <1ccfd1c11003171500x7423cf7cv941a6d87361357c5@mail.gmail.com>

On Wed, Mar 17, 2010 at 14:24, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 17.03.2010 20:12, Martin Buchholz wrote:
>>
>> On Wed, Mar 17, 2010 at 10:02, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>
>>>
>>> Additionally I think, there's a bug in javadoc of those methods.
>>> Actually they throw StringIndexOutOfBoundsException.
>>>
>>
>> Why would that be a bug?
>>
>
> I think, javadoc should indicate StringIndexOutOfBoundsException here:

That would be an incompatible tightening of the spec.

To understand this, you need to think abstractly about
the specification and implementation of the Java Platform
as two completely separate things.
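
A small sketch of the distinction, from the caller's side. CharAtSpecDemo and probe are hypothetical names; the only facts relied on are that StringIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException and that charAt is specified to throw the latter.

```java
final class CharAtSpecDemo {

    /**
     * Calls charAt and returns the exception it throws, or null if the index was valid.
     * The catch clause names only the type promised by the specification.
     */
    static RuntimeException probe(CharSequence s, int index) {
        try {
            s.charAt(index);
            return null;
        } catch (IndexOutOfBoundsException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        RuntimeException e = probe(new StringBuffer("abc"), 5);
        // Prints whichever concrete subtype this particular implementation chose to throw.
        System.out.println(e.getClass().getName());
    }
}
```

Code written against the specification, as probe is, keeps working whichever subtype an implementation throws. Documenting the narrower type would turn an implementation detail into a promise.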

Martin

>    /**
>     * @throws IndexOutOfBoundsException {@inheritDoc}
>     * @see        #length()
>     */
>    public synchronized char charAt(int index) {
>        if ((index < 0) || (index >= count))
>            throw new StringIndexOutOfBoundsException(index);
>        return value[index];
>    }
>
>
> Or am I not enough informed about {@inheritDoc} ?
>
> -Ulf
>
>
>
>


From i30817 at gmail.com  Wed Mar 17 23:59:36 2010
From: i30817 at gmail.com (Paulo Levi)
Date: Wed, 17 Mar 2010 23:59:36 +0000
Subject: Do Set implementations waste memory?
Message-ID: <212322091003171659o7f57afddg21493451887c5b3e@mail.gmail.com>

My understanding is that Set implementations are built on Maps
internally plus a marker object, and that since Maps are implemented using
arrays of entries this is at least n*3 references more than what is needed,
since there are never multiple values.

Any plans to change this? I suspect it would be a boon for programs that use
the correct data structure.

From forax at univ-mlv.fr  Thu Mar 18 01:16:50 2010
From: forax at univ-mlv.fr (Rémi Forax)
Date: Thu, 18 Mar 2010 02:16:50 +0100
Subject: Do Set implementations waste memory?
In-Reply-To: <212322091003171659o7f57afddg21493451887c5b3e@mail.gmail.com>
References: <212322091003171659o7f57afddg21493451887c5b3e@mail.gmail.com>
Message-ID: <4BA17F02.7050906@univ-mlv.fr>

On 18/03/2010 00:59, Paulo Levi wrote:
> My understanding is that set implementations are implemented by using 
> Maps internally + a marker object, and that since Maps are implemented 
> using arrays of entries this is at least n*3 references more that what 
> is needed, since there are never multiple values.
>
> Any plans to change this? I suspect it would be a boon for programs 
> that use the correct data structure.
>

You have to test it.
My guess is that there will be no difference.
As far as I remember, an object needs to be aligned on a 64-bit 
address boundary even in 32-bit mode.
HotSpot uses a 64-bit header and the internal hash map entry contains 4 
int-sized fields, so
if you remove the reference corresponding to the value, the freed space 
will just be
padding and will not be used for anything else.

Otherwise, you can try to remove the internal entry object, but in that case
the hash code of the element will no longer be stored and you will
have a slowdown for all objects that don't cache their hash code themselves.

Rémi


From jim.andreou at gmail.com  Thu Mar 18 02:07:10 2010
From: jim.andreou at gmail.com (Dimitris Andreou)
Date: Thu, 18 Mar 2010 04:07:10 +0200
Subject: Do Set implementations waste memory?
In-Reply-To: <4BA17F02.7050906@univ-mlv.fr>
References: <212322091003171659o7f57afddg21493451887c5b3e@mail.gmail.com>
	<4BA17F02.7050906@univ-mlv.fr>
Message-ID: <7d7138c11003171907x4038968bwc5ba27e1661c9988@mail.gmail.com>

2010/3/18 Rémi Forax <forax at univ-mlv.fr>

> On 18/03/2010 00:59, Paulo Levi wrote:
>
>  My understanding is that set implementations are implemented by using Maps
>> internally + a marker object, and that since Maps are implemented using
>> arrays of entries this is at least n*3 references more that what is needed,
>> since there are never multiple values.
>>
>> Any plans to change this? I suspect it would be a boon for programs that
>> use the correct data structure.
>>
>>
> You have to test it.
> My guess is that there will be no difference.
> As far as I remember, an object needs to be aligned on a valid 64bits
> address even in 32bits mode,
> Hotspot uses a 64bits header and the internal hash map entry contains 4
> ints,
> if you remove the reference corresponding to the value, the empty place
> will be
> considered as garbage and not used.
>


> Else, you can try to remove the internal entry object but in that case
> the hashcode of the element will be not stored anymore and you will
> have a slowdown for all objects that doesn't cache their hashcode by
> itself.
>
> Rémi
>
>
See my second-to-last post in this thread:
http://groups.google.com/group/guava-discuss/browse_thread/thread/23bc8fa5ae479698

In short, I tested removing the "value" field of a HashMap's entry object,
and indeed (through Instrumentation#getObjectSize) I observed no reduction
in memory. I had to remove one further field (e.g. "hash") to make a
reduction (of 8 bytes per entry).
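
The arithmetic behind that observation can be sketched as follows. The numbers are assumptions taken from the discussion above (an 8-byte object header, 4-byte fields, 8-byte alignment on a 32-bit VM), not measurements, and EntrySizeEstimate is a hypothetical name.

```java
final class EntrySizeEstimate {
    static final int HEADER_BYTES = 8;  // assumed object header size
    static final int FIELD_BYTES = 4;   // assumed size of each reference or int field
    static final int ALIGNMENT = 8;     // assumed object alignment

    /** Estimated size of an object with the given number of 4-byte fields, rounded up to the alignment. */
    static int alignedSize(int fieldCount) {
        int raw = HEADER_BYTES + fieldCount * FIELD_BYTES;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        for (int fields = 4; fields >= 2; fields--) {
            System.out.println(fields + " fields: " + alignedSize(fields) + " bytes");
        }
    }
}
```

Under these assumptions, four fields take 24 bytes, three fields take 20 bytes rounded up to 24, and two fields take 16, which is consistent with seeing no change after dropping "value" and an 8-byte saving after dropping one more field.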

Dimitris

From weijun.wang at sun.com  Thu Mar 18 10:27:35 2010
From: weijun.wang at sun.com (weijun.wang at sun.com)
Date: Thu, 18 Mar 2010 10:27:35 +0000
Subject: hg: jdk7/tl/jdk: 6829283: HTTP/Negotiate: Autheticator triggered
	again when user cancels the first one
Message-ID: <20100318102854.BFEB644088@hg.openjdk.java.net>

Changeset: 2796f839e337
Author:    weijun
Date:      2010-03-18 18:26 +0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/2796f839e337

6829283: HTTP/Negotiate: Autheticator triggered again when user cancels the first one
Reviewed-by: chegar

! src/share/classes/sun/net/www/protocol/http/spnego/NegotiateCallbackHandler.java
! test/sun/security/krb5/auto/HttpNegotiateServer.java


From opinali at gmail.com  Thu Mar 18 13:22:47 2010
From: opinali at gmail.com (Osvaldo Doederlein)
Date: Thu, 18 Mar 2010 10:22:47 -0300
Subject: Do Set implementations waste memory?
In-Reply-To: <7d7138c11003171907x4038968bwc5ba27e1661c9988@mail.gmail.com>
References: <212322091003171659o7f57afddg21493451887c5b3e@mail.gmail.com>
	<4BA17F02.7050906@univ-mlv.fr>
	<7d7138c11003171907x4038968bwc5ba27e1661c9988@mail.gmail.com>
Message-ID: <fb5ec5091003180622i3590a6d8r93b50dd58ce0a197@mail.gmail.com>

Hi,

I've read the google-groups thread; it seems you didn't test on a 64-bit
VM. Could you do that test, with and without CompressedOops, and using the
latest HotSpot (7b85 or 6u20ea)? I guess we should see advantages in both
memory savings and speed, at least with CompressedOops.

It is too easy to dismiss an optimization on the basis of "doesn't deliver
benefit on a particular VM". It may be good on a different implementation,
or a different architecture like 32 vs. 64 bits. Object headers, field
layouts, alignments etc., are not portable, and the best rule of thumb is
that any removed field _will_ reduce memory usage at least in some
implementation.

The oldest collection classes were designed for the needs of J2SE 1.2, a
full decade ago. This was discussed before; IIRC there was some reply from
Josh agreeing that some speed-vs-size tradeoffs made last decade should be
revisited today. The extra runtime size/bloat that a specialized HashSet
implementation would cost was reasonably significant in 1999 but is completely
irrelevant in 2010. I mean, HashSet is a HUGELY important collection, and
the benefit of any optimization of its implementation would spread to many
APIs and applications.

And the problem is not only the extra value field, there is also overhead
from the extra indirection (plus extra polymorphic call) from the HashSet
object to the internal HashMap object. This overhead may sometimes be
sufficient to block inlining and devirtualization, so it's a potentially
bigger cost than just a single extra memory load (which is easily hoisted
out of loops etc.). Look at this code inside HashSet for a much worse cost:

    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }

Yeah, we pay the cost of building the internal HashMap's key-set (which is
lazily built), just to iterate the freaking HashSet. (Notice that,
unlike HashMap, a Set is a true Collection that we can iterate
directly without any view-collection of keys/values/entries.)

IMHO all this adds evidence that the current HashSet implementation is a
significant performance bug. We need a brand-new impl that does the hashing
internally, without relying on HashMap, without any unused fields, extra
indirections, or surprising costs like that for iterator(). I guess it would
be relatively simple to copy-paste HashMap's code, cut stuff until just a
Set of keys is left, and merge in the most specific pieces of HashSet
(basically just readObject()/writeObject()).
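
To make the delegation pattern under discussion concrete, here is a minimal sketch of a set layered on a HashMap with a shared marker value. MapBackedSet is a hypothetical, stripped-down illustration and not the actual java.util.HashSet source.

```java
import java.util.HashMap;
import java.util.Iterator;

final class MapBackedSet<E> implements Iterable<E> {
    // Shared dummy value stored for every key; each entry still carries a reference to it.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    boolean add(E e) {
        return map.put(e, PRESENT) == null;   // extra hop: set -> map -> entry
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    @Override
    public Iterator<E> iterator() {
        return map.keySet().iterator();       // goes through the map's key-set view
    }
}
```

Every operation is forwarded to the map, and each entry holds a value reference that a set never needs. Those are the two costs the paragraphs above argue about.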

A+
Osvaldo

2010/3/17 Dimitris Andreou <jim.andreou at gmail.com>

>
> 2010/3/18 Rémi Forax <forax at univ-mlv.fr>
>
> On 18/03/2010 00:59, Paulo Levi wrote:
>>
>>  My understanding is that set implementations are implemented by using
>>> Maps internally + a marker object, and that since Maps are implemented using
>>> arrays of entries this is at least n*3 references more that what is needed,
>>> since there are never multiple values.
>>>
>>> Any plans to change this? I suspect it would be a boon for programs that
>>> use the correct data structure.
>>>
>>
>> You have to test it. My guess is that there will be no difference.
>> As far as I remember, an object needs to be aligned on a valid 64bits
>> address even in 32bits mode,
>> Hotspot uses a 64bits header and the internal hash map entry contains 4
>> ints,
>> if you remove the reference corresponding to the value, the empty place
>> will be
>> considered as garbage and not used.
>>
>
> Else, you can try to remove the internal entry object but in that case
>> the hashcode of the element will be not stored anymore and you will
>> have a slowdown for all objects that doesn't cache their hashcode by
>> itself.
>>
>>
> See my second-to-last post in this thread:
>
> http://groups.google.com/group/guava-discuss/browse_thread/thread/23bc8fa5ae479698
>
> In short, I tested removing the "value" field of a HashMap's entry object,
> and indeed (through Instrumentation#getObjectSize) I observed no reduction
> in memory. I had to remove one further field (e.g. "hash") to make a
> reduction (of 8 bytes per entry).
>

From Ulf.Zibis at gmx.de  Thu Mar 18 13:54:17 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 18 Mar 2010 14:54:17 +0100
Subject: Do Set implementations waste memory?
In-Reply-To: <fb5ec5091003180622i3590a6d8r93b50dd58ce0a197@mail.gmail.com>
References: <212322091003171659o7f57afddg21493451887c5b3e@mail.gmail.com>	<4BA17F02.7050906@univ-mlv.fr>	<7d7138c11003171907x4038968bwc5ba27e1661c9988@mail.gmail.com>
	<fb5ec5091003180622i3590a6d8r93b50dd58ce0a197@mail.gmail.com>
Message-ID: <4BA23089.6070803@gmx.de>

+1

-Ulf

On 18.03.2010 14:22, Osvaldo Doederlein wrote:
>
> The oldest collection classes were designed for the needs of J2SE 1.2, 
> a full decade ago. This was discussed before, IIRC there was some 
> reply from Josh agreeing that some speed-vs-size tradeoffs made last 
> decade should be revisited today. The extra runtime size/bloat that a 
> specialized HashSet implementation would cost, was reasonably 
> significant in 1999 but completely irrelevant in 2010. I mean, HashSet 
> is a HUGELY important collection, and the benefit of any optimization 
> of its implementation would spread to many APIs and applications.
>
> And the problem is not only the extra value field, there is also 
> overhead from the extra indirection (plus extra polymorphic call) from 
> the HashSet object to the internal HashMap object. This overhead may 
> sometimes be sufficient to block inlining and devirtualization, so 
> it's a potentially bigger cost than just a single extra memory load 
> (which is easily hoisted out of loops etc.). Look at this code inside 
> HashSet for a much worse cost:
>
>     public Iterator<E> iterator() {
>         return map.keySet().iterator();
>     }
>
> Yeah we pay the cost of building the internal HashMap's key-set (which 
> is lazily-built), just to iterate the freaking HashSet. (Notice that 
> differently from HashMap, a Set is a true Collection that we can 
> iterate directly without any view-collection of keys/values/entries.)
>
> IMHO all this adds evidence that the current HashSet implementation is 
> a significant performance bug. We need a brand-new impl that does the 
> hashing internally, without relying on HashMap, without any unused 
> fields, extra indirections, or surprising costs like that for 
> iterator(). I guess it would be relatively simple to copy-paste 
> HashMap's code, cut stuff until just a Set of keys is left, and merge 
> in the most specific pieces of HashSet (basically just 
> readObject()/writeObject()).
>


From Ulf.Zibis at gmx.de  Thu Mar 18 14:09:20 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 18 Mar 2010 15:09:20 +0100
Subject: Do Set implementations waste memory?
In-Reply-To: <4BA23089.6070803@gmx.de>
References: <212322091003171659o7f57afddg21493451887c5b3e@mail.gmail.com>	<4BA17F02.7050906@univ-mlv.fr>	<7d7138c11003171907x4038968bwc5ba27e1661c9988@mail.gmail.com>	<fb5ec5091003180622i3590a6d8r93b50dd58ce0a197@mail.gmail.com>
	<4BA23089.6070803@gmx.de>
Message-ID: <4BA23410.1000204@gmx.de>

... and please consider
Bug 6812862 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862> 
- provide customizable hash() algorithm in HashMap for speed tuning
again, and for HashSet too.

-Ulf


On 18.03.2010 14:54, Ulf Zibis wrote:
> +1
>
> -Ulf
>
> On 18.03.2010 14:22, Osvaldo Doederlein wrote:
>>
>> The oldest collection classes were designed for the needs of J2SE 
>> 1.2, a full decade ago. This was discussed before, IIRC there was 
>> some reply from Josh agreeing that some speed-vs-size tradeoffs made 
>> last decade should be revisited today. The extra runtime size/bloat 
>> that a specialized HashSet implementation would cost, was reasonably 
>> significant in 1999 but completely irrelevant in 2010. I mean, 
>> HashSet is a HUGELY important collection, and the benefit of any 
>> optimization of its implementation would spread to many APIs and 
>> applications.
>>
>> And the problem is not only the extra value field, there is also 
>> overhead from the extra indirection (plus extra polymorphic call) 
>> from the HashSet object to the internal HashMap object. This overhead 
>> may sometimes be sufficient to block inlining and devirtualization, 
>> so it's a potentially bigger cost than just a single extra memory 
>> load (which is easily hoisted out of loops etc.). Look at this code 
>> inside HashSet for a much worse cost:
>>
>>     public Iterator<E> iterator() {
>>         return map.keySet().iterator();
>>     }
>>
>> Yeah we pay the cost of building the internal HashMap's key-set 
>> (which is lazily-built), just to iterate the freaking HashSet. 
>> (Notice that differently from HashMap, a Set is a true Collection 
>> that we can iterate directly without any view-collection of 
>> keys/values/entries.)
>>
>> IMHO all this adds evidence that the current HashSet implementation 
>> is a significant performance bug. We need a brand-new impl that does 
>> the hashing internally, without relying on HashMap, without any 
>> unused fields, extra indirections, or surprising costs like that for 
>> iterator(). I guess it would be relatively simple to copy-paste 
>> HashMap's code, cut stuff until just a Set of keys is left, and merge 
>> in the most specific pieces of HashSet (basically just 
>> readObject()/writeObject()).
>>
>
>

From jim.andreou at gmail.com  Thu Mar 18 14:10:01 2010
From: jim.andreou at gmail.com (Dimitris Andreou)
Date: Thu, 18 Mar 2010 14:10:01 +0000
Subject: Do Set implementations waste memory?
In-Reply-To: <fb5ec5091003180622i3590a6d8r93b50dd58ce0a197@mail.gmail.com>
References: <212322091003171659o7f57afddg21493451887c5b3e@mail.gmail.com>
	<4BA17F02.7050906@univ-mlv.fr>
	<7d7138c11003171907x4038968bwc5ba27e1661c9988@mail.gmail.com>
	<fb5ec5091003180622i3590a6d8r93b50dd58ce0a197@mail.gmail.com>
Message-ID: <7d7138c11003180710t35f1d065oac3ffebabf7d271e@mail.gmail.com>

2010/3/18 Osvaldo Doederlein <opinali at gmail.com>

> Hi,
>
> I've read the google-groups thread, it seems you didn't test on a 64-bit
> VM. Could you do that test, with and without CompressedOops, and using
> latest HotSpot (7b85 or 6u20ea)? I guess we should see advantages in both
> memory savings and speed, at least with CompressedOops.
>
> It is too easy to dismiss an optimization on the basis of "doesn't deliver
> benefit on a particular VM". It may be good on a different implementation,
> or a different architecture like 32 vs. 64 bits. Object headers, field
> layouts, alignments etc., are not portable, and the best rule of thumb is
> that any removed field _will_ reduce memory usage at least in some
> implementation.
>
> The oldest collection classes were designed for the needs of J2SE 1.2, a
> full decade ago. This was discussed before, IIRC there was some reply from
> Josh agreeing that some speed-vs-size tradeoffs made last decade should be
> revisited today. The extra runtime size/bloat that a specialized HashSet
> implementation would cost, was reasonably significant in 1999 but completely
> irrelevant in 2010. I mean, HashSet is a HUGELY important collection, and
> the benefit of any optimization of its implementation would spread to many
> APIs and applications.
>
> And the problem is not only the extra value field, there is also overhead
> from the extra indirection (plus extra polymorphic call) from the HashSet
> object to the internal HashMap object. This overhead may sometimes be
> sufficient to block inlining and devirtualization, so it's a potentially
> bigger cost than just a single extra memory load (which is easily hoisted
> out of loops etc.). Look at this code inside HashSet for a much worse cost:
>
>     public Iterator<E> iterator() {
>         return map.keySet().iterator();
>     }
>
> Yeah, we pay the cost of building the internal HashMap's key-set (which is
> lazily built), just to iterate the freaking HashSet. (Notice that,
> unlike HashMap, a Set is a true Collection that we can iterate
> directly without any view-collection of keys/values/entries.)
>
> IMHO all this adds evidence that the current HashSet implementation is a
> significant performance bug. We need a brand-new impl that does the hashing
> internally, without relying on HashMap, without any unused fields, extra
> indirections, or surprising costs like that for iterator(). I guess it would
> be relatively simple to copy-paste HashMap's code, cut stuff until just a
> Set of keys is left, and merge in the most specific pieces of HashSet
> (basically just readObject()/writeObject()).
>

Hi,

Sorry, I was disappointed by the result and sent the code to /dev/null, so
can't readily test that, but yes, it is a relatively simple exercise. In my
opinion, if someone is going to undertake the task of creating a new
HashSet, he'd better start from a blank page rather than going down the
"HashMap-->snip-->snip-->HashSet" path. Even if for some platforms there
would be some gains through this path, not reducing memory footprint on a
large number of 32-bit platforms would be quite a pity. (About runtime
performance, given the amount of "magic" in the JVM, I dare say even
less). I find what Martin suggests on that thread (his second-to-last post)
a quite promising alternative (open addressing plus two parallel arrays, for
keys and for hashes).
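
To make the idea concrete, here is a minimal sketch of an open-addressing set that keeps keys and cached hashes in two parallel arrays, with no per-element entry object. It is not the code from the linked thread; the class name OpenAddressingSet, the linear probing, the 0.5 load factor and the lack of remove() are all assumptions made for illustration.

```java
import java.util.Objects;

/** Hypothetical sketch: open addressing with parallel key and hash arrays. */
final class OpenAddressingSet<E> {
    private Object[] keys = new Object[16];  // null marks an empty slot
    private int[] hashes = new int[16];      // cached hash of the key in the same slot
    private int size;

    public boolean add(E e) {
        Objects.requireNonNull(e);           // null elements are not supported in this sketch
        if ((size + 1) * 2 > keys.length) {
            resize();                         // keep the load factor at or below 0.5
        }
        if (!insert(keys, hashes, e, spread(e.hashCode()))) {
            return false;                     // already present
        }
        size++;
        return true;
    }

    public boolean contains(Object o) {
        if (o == null) {
            return false;
        }
        int h = spread(o.hashCode());
        int mask = keys.length - 1;
        // Linear probing: walk until an empty slot proves the key is absent.
        for (int i = h & mask; keys[i] != null; i = (i + 1) & mask) {
            if (hashes[i] == h && keys[i].equals(o)) {
                return true;
            }
        }
        return false;
    }

    public int size() {
        return size;
    }

    /** Places key in the first free slot of its probe sequence; false if already there. */
    private static boolean insert(Object[] ks, int[] hs, Object key, int h) {
        int mask = ks.length - 1;
        int i = h & mask;
        while (ks[i] != null) {
            if (hs[i] == h && ks[i].equals(key)) {
                return false;
            }
            i = (i + 1) & mask;
        }
        ks[i] = key;
        hs[i] = h;
        return true;
    }

    /** Doubles both arrays; the cached hashes mean hashCode() is not called again. */
    private void resize() {
        Object[] oldKeys = keys;
        int[] oldHashes = hashes;
        Object[] newKeys = new Object[oldKeys.length * 2];
        int[] newHashes = new int[oldKeys.length * 2];
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != null) {
                insert(newKeys, newHashes, oldKeys[i], oldHashes[i]);
            }
        }
        keys = newKeys;
        hashes = newHashes;
    }

    private static int spread(int h) {
        return h ^ (h >>> 16);               // mix high bits into the low bits used for indexing
    }
}
```

The per-element cost in such a layout is one reference slot plus one int slot (times the inverse of the load factor), and the cached hash keeps the comparison cheap for elements that do not cache their own hashcode.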

Just my 2c.

I would love to know in more detail Doug's opinion on the matter.

Dimitris


>
> A+
> Osvaldo
>
> 2010/3/17 Dimitris Andreou <jim.andreou at gmail.com>
>
>>
>> 2010/3/18 Rémi Forax <forax at univ-mlv.fr>
>>
>> On 18/03/2010 00:59, Paulo Levi wrote:
>>>
>>>  My understanding is that set implementations are implemented by using
>>>> Maps internally + a marker object, and that since Maps are implemented using
>>>> arrays of entries this is at least n*3 references more than what is needed,
>>>> since there are never multiple values.
>>>>
>>>> Any plans to change this? I suspect it would be a boon for programs that
>>>> use the correct data structure.
>>>>
>>>
>>> You have to test it. My guess is that there will be no difference.
>>> As far as I remember, an object needs to be aligned on a valid 64bits
>>> address even in 32bits mode,
>>> Hotspot uses a 64bits header and the internal hash map entry contains 4
>>> ints,
>>> if you remove the reference corresponding to the value, the empty place
>>> will be
>>> considered as garbage and not used.
>>>
>>
>> Else, you can try to remove the internal entry object but in that case
>>> the hashcode of the element will not be stored anymore and you will
>>> have a slowdown for all objects that don't cache their hashcode
>>> themselves.
>>>
>>>
>> See my second-to-last post in this thread:
>>
>> http://groups.google.com/group/guava-discuss/browse_thread/thread/23bc8fa5ae479698
>>
>> In short, I tested removing the "value" field of a HashMap's entry object,
>> and indeed (through Instrumentation#getObjectSize) I observed no reduction
>> in memory. I had to remove one further field (e.g. "hash") to make a
>> reduction (of 8 bytes per entry).
>>
>
>
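
The measurement quoted above is consistent with simple alignment arithmetic. The following sketch only does that arithmetic; it does not measure anything. The 8-byte header, 4-byte fields and 8-byte alignment are assumptions taken from the description of 32-bit HotSpot earlier in the thread, and the class EntrySizeEstimate is hypothetical.

```java
/** Hypothetical back-of-the-envelope estimate of a hash entry's size in bytes. */
final class EntrySizeEstimate {

    /** Rounds rawBytes up to the next multiple of alignment (a power of two). */
    static int alignUp(int rawBytes, int alignment) {
        return (rawBytes + alignment - 1) & -alignment;
    }

    /** Size of an object with the given number of 4-byte fields. */
    static int entrySize(int fieldCount) {
        final int headerBytes = 8;  // assumption: 64-bit object header on a 32-bit VM
        final int fieldBytes = 4;   // assumption: every reference or int takes 4 bytes
        return alignUp(headerBytes + fieldCount * fieldBytes, 8);
    }

    public static void main(String[] args) {
        System.out.println("key, value, next, hash: " + entrySize(4)); // 24
        System.out.println("value removed:          " + entrySize(3)); // 24 (20 rounded up)
        System.out.println("value and hash removed: " + entrySize(2)); // 16
    }
}
```

Under these assumptions, dropping one field leaves the entry at 24 bytes because of padding, and dropping a second field saves 8 bytes, which matches the numbers reported above.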

From yu-ching.peng at sun.com  Fri Mar 19 01:08:20 2010
From: yu-ching.peng at sun.com (yu-ching.peng at sun.com)
Date: Fri, 19 Mar 2010 01:08:20 +0000
Subject: hg: jdk7/tl/jdk: 3 new changesets
Message-ID: <20100319010953.BA4894417C@hg.openjdk.java.net>

Changeset: c52f292a8f86
Author:    valeriep
Date:      2010-03-18 17:05 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/c52f292a8f86

6695485: SignedObject constructor throws ProviderException if it's called using provider "SunPKCS11-Solaris"
Summary: Added checking for RSA key lengths in initSign and initVerify
Reviewed-by: vinnie

! src/share/classes/sun/security/pkcs11/P11Signature.java
+ test/sun/security/pkcs11/Signature/TestRSAKeyLength.java

Changeset: df5714cbe76d
Author:    valeriep
Date:      2010-03-18 17:32 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/df5714cbe76d

6591117: Poor preformance of PKCS#11 security provider compared to Sun default provider
Summary: Added internal buffering to PKCS11 SecureRandom impl
Reviewed-by: wetmore

! src/share/classes/sun/security/pkcs11/P11SecureRandom.java

Changeset: dc42c9d9ca16
Author:    valeriep
Date:      2010-03-18 17:56 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/dc42c9d9ca16

6837847: PKCS#11 A SecureRandom and a serialization error following installation of 1.5.0_18
Summary: Added a custom readObject method to PKCS11 SecureRandom impl
Reviewed-by: wetmore

! src/share/classes/sun/security/pkcs11/P11SecureRandom.java
+ test/sun/security/pkcs11/SecureRandom/TestDeserialization.java


From lana.steuck at sun.com  Fri Mar 19 05:17:04 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 19 Mar 2010 05:17:04 +0000
Subject: hg: jdk7/tl: 3 new changesets
Message-ID: <20100319051704.78292441C6@hg.openjdk.java.net>

Changeset: 3ddf90b39176
Author:    mikejwre
Date:      2010-03-04 13:50 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/rev/3ddf90b39176

Added tag jdk7-b85 for changeset cf26288a114b

! .hgtags

Changeset: 433a60a9c0bf
Author:    lana
Date:      2010-03-09 15:28 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/rev/433a60a9c0bf

Merge


Changeset: 98505d97a822
Author:    lana
Date:      2010-03-18 18:50 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/rev/98505d97a822

Merge


From lana.steuck at sun.com  Fri Mar 19 05:17:11 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 19 Mar 2010 05:17:11 +0000
Subject: hg: jdk7/tl/corba: Added tag jdk7-b85 for changeset c67a9df7bc0c
Message-ID: <20100319051713.B7AF2441C7@hg.openjdk.java.net>

Changeset: 6253e28826d1
Author:    mikejwre
Date:      2010-03-04 13:50 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/corba/rev/6253e28826d1

Added tag jdk7-b85 for changeset c67a9df7bc0c

! .hgtags


From lana.steuck at sun.com  Fri Mar 19 05:19:23 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 19 Mar 2010 05:19:23 +0000
Subject: hg: jdk7/tl/hotspot: 2 new changesets
Message-ID: <20100319051931.87330441C8@hg.openjdk.java.net>

Changeset: 418bc80ce139
Author:    mikejwre
Date:      2010-03-04 13:50 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/418bc80ce139

Added tag jdk7-b85 for changeset 6c9796468b91

! .hgtags

Changeset: bf823ef06b4f
Author:    trims
Date:      2010-03-08 15:50 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/hotspot/rev/bf823ef06b4f

Added tag hs17-b10 for changeset 418bc80ce139

! .hgtags


From lana.steuck at sun.com  Fri Mar 19 05:23:25 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 19 Mar 2010 05:23:25 +0000
Subject: hg: jdk7/tl/jaxp: Added tag jdk7-b85 for changeset 6c0ccabb430d
Message-ID: <20100319052325.DADBE441CA@hg.openjdk.java.net>

Changeset: 81c0f115bbe5
Author:    mikejwre
Date:      2010-03-04 13:50 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jaxp/rev/81c0f115bbe5

Added tag jdk7-b85 for changeset 6c0ccabb430d

! .hgtags


From lana.steuck at sun.com  Fri Mar 19 05:23:32 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 19 Mar 2010 05:23:32 +0000
Subject: hg: jdk7/tl/jaxws: Added tag jdk7-b85 for changeset 8424512588ff
Message-ID: <20100319052332.21C08441CB@hg.openjdk.java.net>

Changeset: 512b0e924a5a
Author:    mikejwre
Date:      2010-03-04 13:50 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jaxws/rev/512b0e924a5a

Added tag jdk7-b85 for changeset 8424512588ff

! .hgtags


From lana.steuck at sun.com  Fri Mar 19 05:24:32 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 19 Mar 2010 05:24:32 +0000
Subject: hg: jdk7/tl/jdk: 22 new changesets
Message-ID: <20100319053136.DA606441CE@hg.openjdk.java.net>

Changeset: 03cd9e62961f
Author:    mikejwre
Date:      2010-03-04 13:50 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/03cd9e62961f

Added tag jdk7-b85 for changeset b396584a3e64

! .hgtags

Changeset: 840601ac5ab7
Author:    rkennke
Date:      2010-03-03 15:50 +0100
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/840601ac5ab7

6892485: Deadlock in SunGraphicsEnvironment / FontManager
Summary: Synchronize on correct monitor in SunFontManager.
Reviewed-by: igor, prr

! src/share/classes/sun/font/SunFontManager.java

Changeset: 1d7db2d5c4c5
Author:    minqi
Date:      2010-03-08 11:35 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/1d7db2d5c4c5

6918065: Crash in Java2D blit loop (IntArgbToIntArgbPreSrcOverMaskBlit) in 64bit mode
Reviewed-by: igor, bae

! src/share/classes/java/awt/AlphaComposite.java
+ test/java/awt/AlphaComposite/TestAlphaCompositeForNaN.java

Changeset: 494f5e4f24da
Author:    lana
Date:      2010-03-09 15:26 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/494f5e4f24da

Merge


Changeset: e64331144648
Author:    rupashka
Date:      2010-02-10 15:15 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/e64331144648

6848475: JSlider does not display the correct value of its BoundedRangeModel
Reviewed-by: peterz

! src/share/classes/javax/swing/plaf/basic/BasicSliderUI.java
+ test/javax/swing/JSlider/6848475/bug6848475.java

Changeset: f81c8041ccf4
Author:    peytoia
Date:      2010-02-11 15:58 +0900
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/f81c8041ccf4

6909002: Remove indicim.jar and thaiim.jar from JRE and move to samples if needed
Reviewed-by: okutsu

! make/com/sun/Makefile

Changeset: e2b58a45a426
Author:    peytoia
Date:      2010-02-12 14:38 +0900
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/e2b58a45a426

6921289: (tz) Support tzdata2010b
Reviewed-by: okutsu

! make/sun/javazic/tzdata/VERSION
! make/sun/javazic/tzdata/antarctica
! make/sun/javazic/tzdata/asia
! make/sun/javazic/tzdata/australasia
! make/sun/javazic/tzdata/europe
! make/sun/javazic/tzdata/northamerica
! make/sun/javazic/tzdata/zone.tab
! src/share/classes/sun/util/resources/TimeZoneNames.java
! src/share/classes/sun/util/resources/TimeZoneNames_de.java
! src/share/classes/sun/util/resources/TimeZoneNames_es.java
! src/share/classes/sun/util/resources/TimeZoneNames_fr.java
! src/share/classes/sun/util/resources/TimeZoneNames_it.java
! src/share/classes/sun/util/resources/TimeZoneNames_ja.java
! src/share/classes/sun/util/resources/TimeZoneNames_ko.java
! src/share/classes/sun/util/resources/TimeZoneNames_sv.java
! src/share/classes/sun/util/resources/TimeZoneNames_zh_CN.java
! src/share/classes/sun/util/resources/TimeZoneNames_zh_TW.java

Changeset: e8340332745e
Author:    malenkov
Date:      2010-02-18 17:46 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/e8340332745e

4498236: RFE: Provide a toString method for PropertyChangeEvent and other classes
Reviewed-by: peterz

! src/share/classes/java/beans/BeanDescriptor.java
! src/share/classes/java/beans/EventSetDescriptor.java
! src/share/classes/java/beans/FeatureDescriptor.java
! src/share/classes/java/beans/IndexedPropertyChangeEvent.java
! src/share/classes/java/beans/IndexedPropertyDescriptor.java
! src/share/classes/java/beans/MethodDescriptor.java
! src/share/classes/java/beans/PropertyChangeEvent.java
! src/share/classes/java/beans/PropertyDescriptor.java
+ test/java/beans/Introspector/Test4498236.java

Changeset: 5c03237838e1
Author:    rupashka
Date:      2010-02-27 14:26 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/5c03237838e1

6913758: Specification for SynthViewportUI.paintBorder(...) should mention that this method is never called
Reviewed-by: peterz

! src/share/classes/javax/swing/plaf/synth/SynthViewportUI.java

Changeset: 96205ed1b196
Author:    rupashka
Date:      2010-02-27 14:47 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/96205ed1b196

6918447: SynthToolBarUI.setBorderToXXXX() methods don't correspond inherited spec. They do nothing.
Reviewed-by: peterz

! src/share/classes/javax/swing/plaf/synth/SynthToolBarUI.java

Changeset: 621e921a14cd
Author:    rupashka
Date:      2010-02-27 15:09 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/621e921a14cd

6918861: SynthSliderUI.uninstallDefaults() is not called when UI is uninstalled
Reviewed-by: malenkov

! src/share/classes/javax/swing/plaf/basic/BasicSliderUI.java
! src/share/classes/javax/swing/plaf/synth/SynthSliderUI.java
+ test/javax/swing/JSlider/6918861/bug6918861.java

Changeset: 28741de0bb4a
Author:    rupashka
Date:      2010-02-27 16:03 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/28741de0bb4a

6923305: SynthSliderUI paints the slider track when the slider's "paintTrack" property is set to false
Reviewed-by: alexp

! src/share/classes/javax/swing/plaf/synth/SynthSliderUI.java
+ test/javax/swing/JSlider/6923305/bug6923305.java

Changeset: 2bf137beb9bd
Author:    rupashka
Date:      2010-02-27 16:14 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/2bf137beb9bd

6929298: The SynthSliderUI#calculateTickRect method should be removed
Reviewed-by: peterz

! src/share/classes/javax/swing/plaf/synth/SynthSliderUI.java

Changeset: d6b3a07c8752
Author:    rupashka
Date:      2010-03-03 17:57 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/d6b3a07c8752

6924059: SynthScrollBarUI.configureScrollBarColors() should have spec different from the overridden method
Reviewed-by: peterz

! src/share/classes/javax/swing/plaf/synth/SynthScrollBarUI.java
+ test/javax/swing/JScrollBar/6924059/bug6924059.java

Changeset: 30c520bd148f
Author:    rupashka
Date:      2010-03-03 20:08 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/30c520bd148f

6913768: With default SynthLookAndFeel instance installed new JTable creation leads to throwing NPE
Reviewed-by: peterz

! src/share/classes/javax/swing/JTable.java
! src/share/classes/javax/swing/plaf/synth/SynthTableUI.java
+ test/javax/swing/JTable/6913768/bug6913768.java

Changeset: f13fc955be62
Author:    rupashka
Date:      2010-03-03 20:53 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/f13fc955be62

6917744: JScrollPane Page Up/Down keys do not handle correctly html tables with different cells contents
Reviewed-by: peterz, alexp

! src/share/classes/javax/swing/text/DefaultEditorKit.java
+ test/javax/swing/JEditorPane/6917744/bug6917744.java
+ test/javax/swing/JEditorPane/6917744/test.html

Changeset: 0622086d82ac
Author:    malenkov
Date:      2010-03-04 21:17 +0300
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/0622086d82ac

6921644: XMLEncoder generates invalid XML
Reviewed-by: peterz

! src/share/classes/java/beans/XMLEncoder.java
+ test/java/beans/XMLEncoder/Test5023550.java
+ test/java/beans/XMLEncoder/Test5023557.java
+ test/java/beans/XMLEncoder/Test6921644.java

Changeset: 79a509ac8f35
Author:    lana
Date:      2010-03-01 18:30 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/79a509ac8f35

Merge

! make/com/sun/Makefile
- make/java/text/FILES_java.gmk

Changeset: 90248595ec35
Author:    lana
Date:      2010-03-04 13:07 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/90248595ec35

Merge


Changeset: 2fe4e72288ce
Author:    lana
Date:      2010-03-09 15:28 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/2fe4e72288ce

Merge


Changeset: eae6e9ab2606
Author:    lana
Date:      2010-03-09 15:29 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/eae6e9ab2606

Merge

- test/java/nio/file/WatchService/OverflowEventIsLoner.java

Changeset: dff4f51b73d4
Author:    lana
Date:      2010-03-18 18:52 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/dff4f51b73d4

Merge


From lana.steuck at sun.com  Fri Mar 19 05:37:45 2010
From: lana.steuck at sun.com (lana.steuck at sun.com)
Date: Fri, 19 Mar 2010 05:37:45 +0000
Subject: hg: jdk7/tl/langtools: 3 new changesets
Message-ID: <20100319053756.31267441D1@hg.openjdk.java.net>

Changeset: b816baf594e3
Author:    mikejwre
Date:      2010-03-04 13:50 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/b816baf594e3

Added tag jdk7-b85 for changeset 136bfc679462

! .hgtags

Changeset: ef07347428f2
Author:    lana
Date:      2010-03-09 15:29 -0800
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/ef07347428f2

Merge

- test/tools/javac/treepostests/TreePosTest.java

Changeset: 6fad35d25b1e
Author:    lana
Date:      2010-03-18 18:52 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/6fad35d25b1e

Merge


From christopher.hegarty at sun.com  Fri Mar 19 13:27:27 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Fri, 19 Mar 2010 13:27:27 +0000
Subject: hg: jdk7/tl/jdk: 6935233:
	java/net/ServerSocket/AcceptCauseFileDescriptorLeak.java fails
	with modules build
Message-ID: <20100319132803.016AC44258@hg.openjdk.java.net>

Changeset: 3bb93c410f41
Author:    chegar
Date:      2010-03-19 13:07 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/3bb93c410f41

6935233: java/net/ServerSocket/AcceptCauseFileDescriptorLeak.java fails with modules build
Reviewed-by: alanb

! test/ProblemList.txt
! test/java/net/ServerSocket/AcceptCauseFileDescriptorLeak.java
+ test/java/net/ServerSocket/AcceptCauseFileDescriptorLeak.sh


From Xueming.Shen at Sun.COM  Fri Mar 19 19:56:46 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Fri, 19 Mar 2010 11:56:46 -0800
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003161500o3d41e3felb2ab619f27095082@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com>
	<1ccfd1c11003161500o3d41e3felb2ab619f27095082@mail.gmail.com>
Message-ID: <4BA3D6FE.1010800@sun.com>

Martin Buchholz wrote:
> I renamed my patch file from isSupplementaryCodePoint to isValidCodePoint.
>
> 6934268: Better implementation of Character.isValidCodePoint
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isValidCodePoint
>   

It's fine. But if I were you I would not "optimize" it.

> 6934265: Add public method Character.isBMPCodePoint
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
>   
Looks fine. I will let you know when the CCC is approved.

> 6934270: Remove javac warnings from Character.java
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings
>   
Looks fine.

> 6934271: Better handling of longer utf-8 sequences
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/utf8-twiddling
>   
Looks good, though the code style looks really really...strange:-)


> 6935172: Optimize bit-twiddling in Bits.java
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Bits.java
>   
Looks fine. I was surprised that the javac compiler really generates code 
for expr + 0 and expr << 0.
I seem to remember that the gcc compiler can optimize this kind of situation to 
just expr (if my memory is correct,
or maybe that was the kind of optimization I was planning to do in one of 
my projects :-) ).

-Sherman


> Martin
>
> On Tue, Mar 16, 2010 at 15:35, Xueming Shen <Xueming.Shen at sun.com> wrote:
>   
>> Martin Buchholz wrote:
>>     
>>> On Tue, Mar 16, 2010 at 13:06, Xueming Shen <Xueming.Shen at sun.com> wrote:
>>>
>>>       
>>>> Martin Buchholz wrote:
>>>>
>>>>         
>>>>> Therefore the existing implementation
>>>>>
>>>>>           
>>>>>>>  return codePoint>= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>>>           &&  codePoint<= MAX_CODE_POINT;
>>>>>>>
>>>>>>> will almost always perform just one comparison against a constant,
>>>>>>> which is hard to beat.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>>>> against
>>>>>> only 8 bit values in my case.
>>>>>>
>>>>>>
>>>>>>             
>>>>> It's a good point.  In the machine code, shifts are likely to use
>>>>> immediate values, and so will be a small win.
>>>>>
>>>>> int x = codePoint >>> 16;
>>>>> return x != 0 && x < 0x11;
>>>>>
>>>>> (On modern hardware, these optimizations
>>>>> are less valuable than they used to be;
>>>>> ordinary integer arithmetic is almost free)
>>>>>
>>>>>
>>>>>
>>>>>           
>>>> I'm not convinced if the proposed code is really better...a "small win".
>>>>
>>>>         
>>> The primary theory here is that branches are expensive,
>>> and we are reducing them by one.
>>>
>>>
>>>       
>> There are still two branches in new impl, if you count the "ifeq" and
>> "if_icmpge"(?)
>>
>> We are trying to "optimize" this piece of code with the assumption that the
>> new impl MIGHT help certain vm (hotspot?)
>> to optimize certain use scenario (some consecutive usages), if the compiler
>> and/or the vm are both smart enough at certain
>> point, with no supporting benchmark data?
>>
>> My concern is that the reality might be that this optimization might even
>> hurt the BMP use
>> case (the majority of the possible real world use scenarios) with a 10%
>> bigger bytecode size.
>>
>> -Sherman
>>
>>
>>
>> public class Character extends java.lang.Object {
>>  public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;
>>
>>  public static final int MAX_CODE_POINT = 1114111;
>>
>>  public Character();
>>   Code:
>>      0: aload_0
>>      1: invokespecial #1                  // Method java/lang/Object."<init>":()V
>>      4: return
>>
>>  public static boolean isSupplementaryCodePoint(int);
>>   Code:
>>      0: iload_0
>>      1: ldc           #2                  // int 65536
>>      3: if_icmplt     16
>>      6: iload_0
>>      7: ldc           #3                  // int 1114111
>>      9: if_icmpgt     16
>>     12: iconst_1
>>     13: goto          17
>>     16: iconst_0
>>     17: ireturn
>>
>>  public static boolean isSupplementaryCodePoint_new(int);
>>   Code:
>>      0: iload_0
>>      1: bipush        16
>>      3: iushr
>>      4: istore_1
>>      5: iload_1
>>      6: ifeq          19
>>      9: iload_1
>>     10: bipush        17
>>     12: if_icmpge     19
>>     15: iconst_1
>>     16: goto          20
>>     19: iconst_0
>>     20: ireturn
>> }
>>
>>
>>
>>
>>
>>
>>     
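
For reference, the two variants discussed in this message can be put side by side at the source level. This is a sketch only: the class SupplementaryCheck and its method names are made up for illustration, and it says nothing about which variant is faster. It only shows that both return the same answer, including for negative and out-of-range inputs.

```java
/** Hypothetical side-by-side of the two isSupplementaryCodePoint variants. */
final class SupplementaryCheck {
    static final int MIN_SUPPLEMENTARY_CODE_POINT = 0x10000;  // 65536
    static final int MAX_CODE_POINT = 0x10FFFF;               // 1114111

    /** Existing form: two comparisons against 32-bit constants. */
    static boolean isSupplementaryRange(int codePoint) {
        return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
            && codePoint <= MAX_CODE_POINT;
    }

    /** Proposed form: shift out the low 16 bits, then compare the plane number. */
    static boolean isSupplementaryShift(int codePoint) {
        int plane = codePoint >>> 16;   // unsigned shift, so negative inputs give a large plane
        return plane != 0 && plane < 0x11;
    }

    /** True if both variants agree on every sample. */
    static boolean agreeOn(int... samples) {
        for (int cp : samples) {
            if (isSupplementaryRange(cp) != isSupplementaryShift(cp)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOn(Integer.MIN_VALUE, -1, 0, 0xFFFF, 0x10000,
                                   0x10FFFF, 0x110000, Integer.MAX_VALUE));
    }
}
```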


From Ulf.Zibis at gmx.de  Fri Mar 19 20:29:37 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 19 Mar 2010 21:29:37 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003170846n19c3e273v71daff3a755c58a4@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA00564.4010104@gmx.de>	
	<4BA09478.90809@gmx.de> <4BA09CDC.1040402@gmx.de>
	<1ccfd1c11003170846n19c3e273v71daff3a755c58a4@mail.gmail.com>
Message-ID: <4BA3DEB1.9080303@gmx.de>

On 17.03.2010 16:46, Martin Buchholz wrote:
> On Wed, Mar 17, 2010 at 01:11, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am I mad ???
>>
>> 2nd. correction:
>>
>> But
>>         for (int i = offset; i<  offset + count; i++) {
>>             int c = codePoints[i];
>>             char plane = (char)(c>>>  16);
>>             if (plane == 0)
>>                 n += 1;
>>             else if (plane<  0x11)
>>                 n += 2;
>>             else throw new IllegalArgumentException(Integer.toString(c));
>>         }
>> also has only 2 branches and additionally could benefit from tiny 16-bit
>> comparisons.
>> The shift additionally could be omitted on CPUs which can benefit from
>> 6933327.
>>      
> I'm not an x86 or hotspot expert, but I would think that the "plane"
> variable is never written to memory, but lives only in a register,
> so I see only drawbacks to making plane a "char".
>    

The char is not important here; it might give hotspot a hint that the value is 
always a positive 16-bit quantity. My idea was to indicate this to the reader.

I saw that you usually put a space after casts. Why? A cast is a 
one-operand operator like - -- ++. This is a rare style in the JDK sources 
which "disturbs" my eyes. ;-)

-Ulf


From martinrb at google.com  Fri Mar 19 20:47:22 2010
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 19 Mar 2010 13:47:22 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA3DEB1.9080303@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA00564.4010104@gmx.de>
	<4BA09478.90809@gmx.de> <4BA09CDC.1040402@gmx.de>
	<1ccfd1c11003170846n19c3e273v71daff3a755c58a4@mail.gmail.com>
	<4BA3DEB1.9080303@gmx.de>
Message-ID: <1ccfd1c11003191347p1f633502kc8ab34ef119f2a29@mail.gmail.com>

On Fri, Mar 19, 2010 at 13:29, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 17.03.2010 16:46, Martin Buchholz wrote:

> The char is not important here, maybe give hotspot a hint that value is
> always positive 16-bit. My idea was to indicate this to the reader.

I think naming the variable "plane" and using the ">>>" operator
do a good job of making this hint to the reader.
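
A self-contained sketch of the loop under discussion, written with an int "plane" variable, is shown here for illustration. The wrapper class PlaneCount and the method name charCount are assumptions, not names from the patch.

```java
/** Hypothetical wrapper around the plane-based counting loop from this thread. */
final class PlaneCount {

    /**
     * Counts the UTF-16 chars needed for codePoints[offset .. offset+count-1].
     * Throws IllegalArgumentException for anything outside U+0000..U+10FFFF.
     */
    static int charCount(int[] codePoints, int offset, int count) {
        int n = 0;
        for (int i = offset; i < offset + count; i++) {
            int c = codePoints[i];
            int plane = c >>> 16;       // 0 for BMP, 1..0x10 for supplementary planes
            if (plane == 0) {
                n += 1;                 // BMP code point: one char
            } else if (plane < 0x11) {
                n += 2;                 // supplementary code point: surrogate pair
            } else {
                throw new IllegalArgumentException(Integer.toString(c));
            }
        }
        return n;
    }
}
```

Because ">>>" is an unsigned shift, a negative input yields a plane of 0x8000 or more and falls through to the exception, so no separate "c < 0" test is needed.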

>
> I saw that you usually put a space after casts. Why? A cast is a one-operand
> operator like - -- ++. This is a rare style in the JDK sources which "disturbs"
> my eyes. ;-)

The JDK code I have maintained uses space after cast.
We don't have a really well-maintained coding standard, but
the closest thing we do have agrees with me:

http://java.sun.com/docs/codeconv/html/CodeConventions.doc7.html#475

Nevertheless, you are right - I was surprised that space after cast
is less popular in the JDK sources.

Martin


From Ulf.Zibis at gmx.de  Fri Mar 19 21:27:00 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 19 Mar 2010 22:27:00 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4BA10BEC.8050105@sun.com>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA00564.4010104@gmx.de>
	<4BA09478.90809@gmx.de> <4BA09CDC.1040402@gmx.de>
	<1ccfd1c11003170846n19c3e273v71daff3a755c58a4@mail.gmail.com>
	<4BA10BEC.8050105@sun.com>
Message-ID: <4BA3EC24.7010809@gmx.de>

On 17.03.2010 18:05, Xueming Shen wrote:
> Martin Buchholz wrote:
>> On Wed, Mar 17, 2010 at 01:11, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>> Am I mad ???
>>>
>>> 2nd. correction:
>>>
>>> But
>>>        for (int i = offset; i < offset + count; i++) {
>>>            int c = codePoints[i];
>>>            char plane = (char)(c >>> 16);
>>>            if (plane == 0)
>>>                n += 1;
>>>            else if (plane < 0x11)
>>>                n += 2;
>>>            else throw new 
>>> IllegalArgumentException(Integer.toString(c));
>>>        }
>>> also has only 2 branches and additionally could benefit from tiny 16-bit
>>> comparisons.
>>> The shift additionally could be omitted on CPUs which can benefit from
>>> 6933327.
>>
>> I'm not an x86 or hotspot expert, but I would think that the "plane"
>> variable is never written to memory, but lives only in a register,
>> so I see only drawbacks to making plane a "char".
>>
> I doubt there is any benefit in using an 8-bit or 16-bit operand on a 
> 32-bit/64-bit machine.
> While optimization is definitely good, it might not be a good 
> habit to code in a high-level
> programming language while thinking in assembly every minute :-)  
> Let's leave those
> optimizations to the hotspot engineers :-)

Yes, you are right. Unfortunately they are behind on such things. As 
I said, the "(c >>> 16) == 0" trick will lose its justification if 
the JIT becomes smart enough to convert a range check into a single unsigned 
compare.
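
To illustrate what "a single unsigned compare" means here: a signed range check lo <= x <= hi can be rewritten as one unsigned comparison of (x - lo) against (hi - lo + 1). The sketch below spells that out by hand using Integer.compareUnsigned, which only exists from Java 8 onwards, so it is an illustration of the transformation and not something available in the JDK 7 code base under review. The class and method names are hypothetical.

```java
/** Hypothetical illustration of range checks written as one unsigned compare. */
final class UnsignedRangeCheck {

    /** Plain form: two signed comparisons. */
    static boolean isValidTwoCompares(int cp) {
        return cp >= 0 && cp <= 0x10FFFF;
    }

    /** Shift form from this thread: the plane number must be below 0x11. */
    static boolean isValidShift(int cp) {
        return (cp >>> 16) < 0x11;
    }

    /** Unsigned form: negative values look like huge unsigned numbers and fail. */
    static boolean isValidUnsigned(int cp) {
        return Integer.compareUnsigned(cp, 0x10FFFF) <= 0;
    }

    /** Range with a non-zero lower bound: subtract it first, then compare once. */
    static boolean isSupplementaryUnsigned(int cp) {
        return Integer.compareUnsigned(cp - 0x10000, 0x100000) < 0;
    }
}
```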

-Ulf


> In this particular case, given that most applications will never use 
> supplementary characters, I
> doubt it is really worth the optimization and I would definitely not try 
> to change the impl
> of isSupplementaryCP to make this code "better". If you really really 
> want to optimize
> this code the alternative is to have a package private 
> Character.getPlane(), or simply
> to use your optimized code above. I would suggest using int for 
> plane, instead of char or
> byte.
>
> -Sherman
>
>
>
>
>


From Ulf.Zibis at gmx.de  Fri Mar 19 21:46:29 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 19 Mar 2010 22:46:29 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4BA007A4.2030907@sun.com>
References: <4A95079A.8080803@gmx.de>
	<4A9578C4.8060801@sun.com>	<4B8DA070.3040306@gmx.de>	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	<4B8E3DA3.7090902@gmx.de>	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	<4B9FE4DD.1090405@sun.com>	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com>
Message-ID: <4BA3F0B5.1070404@gmx.de>

On 16.03.2010 23:35, Xueming Shen wrote:
> Martin Buchholz wrote:
>>
>> The primary theory here is that branches are expensive,
>> and we are reducing them by one.
>>
>
> There are still two branches in new impl, if you count the "ifeq" and 
> "if_icmpge"(?)
>
> We are trying to "optimize" this piece of code with the assumption 
> that the new impl MIGHT help certain vm (hotspot?)
> to optimize certain use scenario (some consecutive usages), if the 
> compiler and/or the vm are both smart enough at certain
> point, with no supporting benchmark data?

I've finished the benchmark:
https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/branches/JDK-7/j_l_Character_charCount/src/java/lang/CharacterBenchmark.java?rev=1006&view=log

The results:
time1: 2316,213 ms  ..à la Martin
time2: 1267,063 ms
time3: 1245,972 ms  ..using isValidCodePoint
time4: 1467,570 ms  ..validate version   (slower, because of 
unreasonable HotSpot optimizing, see "C2 optimization bug ?" in 
hotspot-compiler-dev list)

Here see the disassembly snippets:
https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/branches/JDK-7/j_l_Character_charCount/log/PA_Character_compare.txt?rev=1007&view=markup

Detailed:
https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/branches/JDK-7/j_l_Character_charCount/log/PA_Character.xml?rev=1006&view=markup

Little NetBeans project:
https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/branches/JDK-7/j_l_Character_charCount/

Now I have two patches in my mq queue.
Martin, how do I create 2 exports in the form you would like?
Should I use hg export with some magic option?

-Ulf


From martinrb at google.com  Sat Mar 20 00:13:13 2010
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 19 Mar 2010 17:13:13 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA3F0B5.1070404@gmx.de>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
Message-ID: <1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>

Interesting benchmark results!

Your microbenchmark technique looks unusual, but seems to work.

I'm surprised there is that much difference.

I would take out the swallowing of Exception.

---

Your data contains only supplementary characters,
which we are assuming are very rare.
So I don't consider speeding up such a benchmark
very important, but....

We can do it for free
by switching isSupplementaryCodePoint => isValidCodePoint,
so why not?
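
Why the switch is free: once the BMP test has failed, the only valid code
points left are the supplementary ones, so the cheaper validity test selects
exactly the same values. A minimal sketch that checks this at the range
boundaries. It assumes the method names found in released JDKs
(Character.isBmpCodePoint, Java 7 and later) rather than the isBMPCodePoint
spelling used in the patch; the class and method names are made up for the
illustration.

```java
final class BmpThenValid {
    // Branch order used in the patch: BMP first, then "valid", else invalid.
    // Returns the number of chars needed, or -1 for an invalid code point.
    static int classify(int cp) {
        if (Character.isBmpCodePoint(cp)) {
            return 1;
        } else if (Character.isValidCodePoint(cp)) {
            return 2;   // not BMP but valid, so it must be supplementary
        } else {
            return -1;
        }
    }

    // The same classification spelled with isSupplementaryCodePoint.
    static int classifyReference(int cp) {
        if (Character.isBmpCodePoint(cp)) {
            return 1;
        } else if (Character.isSupplementaryCodePoint(cp)) {
            return 2;
        } else {
            return -1;
        }
    }

    public static void main(String[] args) {
        int[] samples = {
            Integer.MIN_VALUE, -1, 0, 'A', 0xD800, 0xFFFF,
            0x10000, 0x10042, 0x10FFFF, 0x110000, Integer.MAX_VALUE
        };
        for (int cp : samples) {
            if (classify(cp) != classifyReference(cp)) {
                throw new AssertionError("mismatch at " + Integer.toHexString(cp));
            }
        }
        System.out.println("both orderings agree on all samples");
    }
}
```

The sample list only covers boundary values; it is a sanity check, not a
benchmark, and says nothing about which form a given JIT compiles better.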

---

While checking this, I noticed that Character.toChars can
be sped up by using our new isBMPCodePoint method
(always optimize for BMP!)

---

Here's the change I'm making on top of isBMPCodePoint:
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2/

Ulf, please review.

diff --git a/src/share/classes/java/lang/Character.java
b/src/share/classes/java/lang/Character.java
--- a/src/share/classes/java/lang/Character.java
+++ b/src/share/classes/java/lang/Character.java
@@ -3099,15 +3099,15 @@
      * @since  1.5
      */
     public static int toChars(int codePoint, char[] dst, int dstIndex) {
-        if (codePoint < 0 || codePoint > MAX_CODE_POINT) {
+        if (isBMPCodePoint(codePoint)) {
+            dst[dstIndex] = (char) codePoint;
+            return 1;
+        } else if (isValidCodePoint(codePoint)) {
+            toSurrogates(codePoint, dst, dstIndex);
+            return 2;
+        } else {
             throw new IllegalArgumentException();
         }
-        if (codePoint < MIN_SUPPLEMENTARY_CODE_POINT) {
-            dst[dstIndex] = (char) codePoint;
-            return 1;
-        }
-        toSurrogates(codePoint, dst, dstIndex);
-        return 2;
     }

     /**
@@ -3127,15 +3127,15 @@
      * @since  1.5
      */
     public static char[] toChars(int codePoint) {
-        if (codePoint < 0 || codePoint > MAX_CODE_POINT) {
+        if (isBMPCodePoint(codePoint)) {
+            return new char[] { (char) codePoint };
+        } else if (isValidCodePoint(codePoint)) {
+            char[] result = new char[2];
+            toSurrogates(codePoint, result, 0);
+            return result;
+        } else {
             throw new IllegalArgumentException();
         }
-        if (codePoint < MIN_SUPPLEMENTARY_CODE_POINT) {
-                return new char[] { (char) codePoint };
-        }
-        char[] result = new char[2];
-        toSurrogates(codePoint, result, 0);
-        return result;
     }

     static void toSurrogates(int codePoint, char[] dst, int index) {
diff --git a/src/share/classes/java/lang/String.java
b/src/share/classes/java/lang/String.java
--- a/src/share/classes/java/lang/String.java
+++ b/src/share/classes/java/lang/String.java
@@ -281,7 +281,7 @@
             int c = codePoints[i];
             if (Character.isBMPCodePoint(c))
                 n += 1;
-            else if (Character.isSupplementaryCodePoint(c))
+            else if (Character.isValidCodePoint(c))
                 n += 2;
             else throw new IllegalArgumentException(Integer.toString(c));
         }
diff --git a/src/share/classes/sun/nio/cs/Surrogate.java
b/src/share/classes/sun/nio/cs/Surrogate.java
--- a/src/share/classes/sun/nio/cs/Surrogate.java
+++ b/src/share/classes/sun/nio/cs/Surrogate.java
@@ -294,7 +294,7 @@
                 dst.put((char)uc);
                 error = null;
                 return 1;
-            } else if (Character.isSupplementaryCodePoint(uc)) {
+            } else if (Character.isValidCodePoint(uc)) {
                 if (dst.remaining() < 2) {
                     error = CoderResult.OVERFLOW;
                     return -1;
@@ -338,7 +338,7 @@
                 da[dp] = (char)uc;
                 error = null;
                 return 1;
-            } else if (Character.isSupplementaryCodePoint(uc)) {
+            } else if (Character.isValidCodePoint(uc)) {
                 if (dl - dp < 2) {
                     error = CoderResult.OVERFLOW;
                     return -1;

Martin

On Fri, Mar 19, 2010 at 14:46, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 16.03.2010 23:35, schrieb Xueming Shen:
>>
>> Martin Buchholz wrote:
>>>
>>> The primary theory here is that branches are expensive,
>>> and we are reducing them by one.
>>>
>>
>> There are still two branches in new impl, if you count the "ifeq" and
>> "if_icmpge"(?)
>>
>> We are trying to "optimize" this piece of code with the assumption that
>> the new impl MIGHT help certain vm (hotspot?)
>> to optimize certain use scenario (some consecutive usages), if the
>> compiler and/or the vm are both smart enough at certain
>> point, with no supporting benchmark data?
>
> I've finished the benchmark:
> https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/branches/JDK-7/j_l_Character_charCount/src/java/lang/CharacterBenchmark.java?rev=1006&view=log
>
> The results:
> time1: 2316,213 ms  ..à la Martin
> time2: 1267,063 ms
> time3: 1245,972 ms  ..using isValidCodePoint
> time4: 1467,570 ms  ..validate version   (slower, because of unreasonable
> HotSpot optimizing, see "C2 optimization bug ?" in hotspot-compiler-dev
> list)
>
> Here see the disassembly snippets:
> https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/branches/JDK-7/j_l_Character_charCount/log/PA_Character_compare.txt?rev=1007&view=markup
>


From martinrb at google.com  Sat Mar 20 00:17:59 2010
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 19 Mar 2010 17:17:59 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA3F0B5.1070404@gmx.de>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>
	<4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
Message-ID: <1ccfd1c11003191717g5ff27fd1m100985fd780fabc4@mail.gmail.com>

On Fri, Mar 19, 2010 at 14:46, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 16.03.2010 23:35, schrieb Xueming Shen:

> Now I have two patches in my mq queue.
> Martin, how do I create 2 exports in the form, you would like?

Just copy the patch files to some public web-accessible place,
as I do with cr.openjdk.java.net.  It seems you do that with
your java.net projects, but you should really put them on
cr.openjdk.java.net, since it's designed for that purpose.

Martin


From kelly.ohair at sun.com  Sat Mar 20 01:18:24 2010
From: kelly.ohair at sun.com (kelly.ohair at sun.com)
Date: Sat, 20 Mar 2010 01:18:24 +0000
Subject: hg: jdk7/tl: 6936788: Minor adjustment to top repo test/Makefile,
	missing non-zero exit case
Message-ID: <20100320011825.0F4AE44311@hg.openjdk.java.net>

Changeset: 35d272ef7598
Author:    ohair
Date:      2010-03-19 18:17 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/rev/35d272ef7598

6936788: Minor adjustment to top repo test/Makefile, missing non-zero exit case
Reviewed-by: jjg

! test/Makefile


From martinrb at google.com  Sat Mar 20 05:01:07 2010
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 19 Mar 2010 22:01:07 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8DA070.3040306@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
Message-ID: <1ccfd1c11003192201s222a233fi6f045d4d0e88febb@mail.gmail.com>

Here's another little improvement that should use isBMPCodePoint:

diff --git a/src/share/classes/java/lang/AbstractStringBuilder.java
b/src/share/classes/java/lang/AbstractStringBuilder.java
--- a/src/share/classes/java/lang/AbstractStringBuilder.java
+++ b/src/share/classes/java/lang/AbstractStringBuilder.java
@@ -719,20 +719,17 @@
      * {@code codePoint} isn't a valid Unicode code point
      */
     public AbstractStringBuilder appendCodePoint(int codePoint) {
-        if (!Character.isValidCodePoint(codePoint)) {
+        if (Character.isBMPCodePoint(codePoint)) {
+            ensureCapacityInternal(count + 1);
+            value[count] = (char) codePoint;
+            count += 1;
+        } else if (Character.isValidCodePoint(codePoint)) {
+            ensureCapacityInternal(count + 2);
+            Character.toSurrogates(codePoint, value, count);
+            count += 2;
+        } else {
             throw new IllegalArgumentException();
         }
-        int n = 1;
-        if (codePoint >= Character.MIN_SUPPLEMENTARY_CODE_POINT) {
-            n++;
-        }
-        ensureCapacityInternal(count + n);
-        if (n == 1) {
-            value[count++] = (char) codePoint;
-        } else {
-            Character.toSurrogates(codePoint, value, count);
-            count += n;
-        }
         return this;
     }
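
The observable contract is unchanged by this patch: a BMP code point appends
one char, a supplementary code point appends two, and anything else is
rejected. A small sketch that exercises this through the public
StringBuilder.appendCodePoint API; the helper class and its method names are
illustrative only.

```java
final class AppendCodePointDemo {
    // Number of chars that appending a single code point adds to an empty builder.
    static int charsAppended(int codePoint) {
        StringBuilder sb = new StringBuilder();
        sb.appendCodePoint(codePoint);   // IllegalArgumentException if invalid
        return sb.length();
    }

    // True if appendCodePoint refuses the argument.
    static boolean rejects(int codePoint) {
        try {
            charsAppended(codePoint);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(charsAppended('A'));      // BMP
        System.out.println(charsAppended(0x10042));  // supplementary
        System.out.println(rejects(0x110000));       // above MAX_CODE_POINT
    }
}
```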


Martin


From martinrb at google.com  Sat Mar 20 18:36:22 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sat, 20 Mar 2010 11:36:22 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
Message-ID: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>

For a change, here's an actual plain old "incorrect result" bug fix
for String.lastIndexOf

Sherman, please file a bug and review.

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf/

Also includes our usual performance-oriented fiddling.

public class LastIndexOf {
    public static void main(String[] args) {
        int ch = 0x10042;
        char[] bug = new char[3];
        Character.toChars(ch, bug, 0);
        bug[2] = bug[0];
        System.out.println(new String(bug).lastIndexOf(ch));
        bug[2] = '!';
        System.out.println(new String(bug).lastIndexOf(ch));
    }
}
==> javac -source 1.6 -Xlint:all LastIndexOf.java
==> java -esa -ea LastIndexOf
-1
0
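
For reference, both lines should print 0: the surrogate pair for U+10042
starts at index 0 in both strings, and the unpaired high surrogate at the end
of the first string should not hide it. A naive backward scan, written only
to pin down the expected answer (this is not the code in the webrev; the
class and method names are illustrative):

```java
final class NaiveLastIndexOf {
    // Last index at which the surrogate pair of a supplementary code point
    // starts in s, or -1 if the pair does not occur.
    static int lastIndexOf(CharSequence s, int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        char[] pair = Character.toChars(codePoint);   // {high, low}
        for (int i = s.length() - 2; i >= 0; i--) {
            if (s.charAt(i) == pair[0] && s.charAt(i + 1) == pair[1]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int ch = 0x10042;
        char[] bug = new char[3];
        Character.toChars(ch, bug, 0);
        bug[2] = bug[0];                               // unpaired trailing high surrogate
        System.out.println(lastIndexOf(new String(bug), ch));
        bug[2] = '!';
        System.out.println(lastIndexOf(new String(bug), ch));
    }
}
```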


From Ulf.Zibis at gmx.de  Sat Mar 20 21:52:32 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sat, 20 Mar 2010 22:52:32 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>	
	<4B8DA070.3040306@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
Message-ID: <4BA543A0.2060600@gmx.de>

Am 20.03.2010 01:13, schrieb Martin Buchholz:
> Interesting benchmark results!
>
> Your microbenchmark technique looks unusual, but seems to work.
>    

- yes, warmup is integrated without the need to code an extra loop
-- here is the more sophisticated version, which detects slowdowns caused by GC,
HotSpot or OS activity (line 245 ...):

https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/branches/j7_EUC_TW/src/sun/nio/cs/ext/EUC_TWBenchmark.java?rev=1008&view=markup
- my technique mostly eliminates the influence of irregular system slowdowns
-- this is of special importance on mobile systems, where the CPU clock
can be throttled because of overheating
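
One common way to get this kind of robustness is to repeat the measurement
and keep only the fastest run: early runs double as warmup, and a run
disturbed by GC, JIT compilation or the OS simply loses. The sketch below
illustrates that general idea only; it is an assumption for illustration and
not necessarily what the linked benchmark does. All names are hypothetical.

```java
import java.util.function.IntSupplier;

final class MinOfRuns {
    // Runs the workload several times and returns the shortest wall-clock
    // time in nanoseconds. Slow outliers (GC, JIT, throttling) are discarded
    // because only the minimum is kept.
    static long bestNanos(IntSupplier work, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        long best = Long.MAX_VALUE;
        int sink = 0;                       // consume results so the work is not optimized away
        for (int r = 0; r < runs; r++) {
            long t0 = System.nanoTime();
            sink += work.getAsInt();
            long elapsed = System.nanoTime() - t0;
            if (elapsed < best) {
                best = elapsed;
            }
        }
        if (sink == 42) {
            System.out.print("");           // keeps 'sink' observably used
        }
        return best;
    }

    public static void main(String[] args) {
        long best = bestNanos(() -> {
            int n = 0;
            for (int cp = 0x10000; cp < 0x110000; cp++) {
                n += Character.charCount(cp);
            }
            return n;
        }, 20);
        System.out.println("best run: " + best + " ns");
    }
}
```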

> I'm surprised there is that much difference.
>    

I wasn't, after studying the disassemblies.

> I would take out the swallowing of Exception.
>    

Thanx, caused by copy'n paste. ;-)

> ---
>
> Your data contains only supplementary characters,
> which we are assuming are very rare.
>    

Yes, on BMP characters all variations have same speed.

> So I don't consider speeding up such a benchmark
> very important,

Yes, even on Taiwanese machines using EUC_TW, the frequency of surrogates
may be about 1 %, but character processing in general should be frequent in
many applications.
On the other hand, the usage frequency of other APIs, e.g. array sort, should
also be low across applications overall. Does that justify neglected code,
particularly when we can have the improvement for more or less nothing?

>   but....
>
> We can do it for free
> by switching isSupplementaryCodePoint =>  isValidCodePoint,
> so why not?
>    

Yep, I revised the other methods only cursorily; my focus was on String(int[],
int, int) and outsourcing the bounds checks.
BTW, what do you think of the latter?

> ---
>
> While checking this, I noticed that Character.toChars can
> be sped up by using our new isBMPCodePoint method
> (always optimize for BMP!)
>    

I guess you have noticed that I had already made the main change earlier.
But I couldn't imagine that we would drop the optimized form of
isSupplementaryCodePoint(). ;-)

> ---
>
> Here's the change I'm making on top of isBMPCodePoint:
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2/
>
> Ulf, please review.
>    

- have you noticed my change on toSurrogates() in returning the count? 
Useful too in AbstractStringBuilder#appendCodePoint() and plenty of 
sun.nio.cs decoders.

- A little "bug" in javadoc:
   @exception ArrayIndexOutOfBoundsException
   instead    IndexOutOfBoundsException

- String#indexOf(int, int):
              // handle supplementary characters here
              char high = Character.highSurrogate(ch);
              char low = Character.lowSurrogate(ch);
              for ( ; i < max-1; i++)
                  if (v[i] == high && v[i+1] == low)
                          return i - offset;
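
A self-contained version of the loop above, as a sketch. It works on a plain
char[] instead of the String internals (v, offset, max), and it assumes
Character.highSurrogate and Character.lowSurrogate are available, as they are
in JDK 7 and later; the class name and the fromIndex handling are
illustrative.

```java
final class SupplementaryIndexOf {
    // First index >= fromIndex at which the surrogate pair of the given
    // supplementary code point starts, or -1 if it does not occur.
    static int indexOf(char[] v, int ch, int fromIndex) {
        if (!Character.isSupplementaryCodePoint(ch)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        char high = Character.highSurrogate(ch);
        char low = Character.lowSurrogate(ch);
        // Stop one short of the end: a pair needs two chars.
        for (int i = Math.max(fromIndex, 0); i < v.length - 1; i++) {
            if (v[i] == high && v[i + 1] == low) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] text = "a\uD800\uDC42b".toCharArray();   // 'a', U+10042, 'b'
        System.out.println(indexOf(text, 0x10042, 0));
        System.out.println(indexOf(text, 0x10042, 2));
    }
}
```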


-Ulf


From Ulf.Zibis at gmx.de  Sat Mar 20 22:05:53 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sat, 20 Mar 2010 23:05:53 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
Message-ID: <4BA546C1.5040201@gmx.de>

Good catch!
Please also consider my additional twiddling on indexOf.

-Ulf


Am 20.03.2010 19:36, schrieb Martin Buchholz:
> For a change, here's an actual plain old "incorrect result" bug fix
> for String.lastIndexOf
>
> Sherman, please file a bug and review.
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf/
>
> Also includes our usual performance-oriented fiddling.
>
> public class LastIndexOf {
>      public static void main(String[] args) {
>          int ch = 0x10042;
>          char[] bug = new char[3];
>          Character.toChars(ch, bug, 0);
>          bug[2] = bug[0];
>          System.out.println(new String(bug).lastIndexOf(ch));
>          bug[2] = '!';
>          System.out.println(new String(bug).lastIndexOf(ch));
>      }
> }
> ==>  javac -source 1.6 -Xlint:all LastIndexOf.java
> ==>  java -esa -ea LastIndexOf
> -1
> 0
>
>
>    


From Ulf.Zibis at gmx.de  Sat Mar 20 22:50:49 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sat, 20 Mar 2010 23:50:49 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA546C1.5040201@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA546C1.5040201@gmx.de>
Message-ID: <4BA55149.5070603@gmx.de>

Oops, I looked at your webrev later and saw that you had the same idea at
the same time, while I was composing my email before last.

Why don't you outsource indexOfBMP, lastIndexOfBMP, or to be sincere IMO
too much source code + byte code overhead for a 3-liner that is used only once.

I doubt whether all the finals will have any benefit. Some time ago I too
fell into that trap, or am I wrong? Examine the disassembly.

-Ulf


Am 20.03.2010 23:05, schrieb Ulf Zibis:
> Good catch!
> Additionally consider my additional twiddling on indexOf.
>
> -Ulf
>
>
> Am 20.03.2010 19:36, schrieb Martin Buchholz:
>> For a change, here's an actual plain old "incorrect result" bug fix
>> for String.lastIndexOf
>>
>> Sherman, please file a bug and review.
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf/
>>
>> Also includes our usual performance-oriented fiddling.
>>
>> public class LastIndexOf {
>>      public static void main(String[] args) {
>>          int ch = 0x10042;
>>          char[] bug = new char[3];
>>          Character.toChars(ch, bug, 0);
>>          bug[2] = bug[0];
>>          System.out.println(new String(bug).lastIndexOf(ch));
>>          bug[2] = '!';
>>          System.out.println(new String(bug).lastIndexOf(ch));
>>      }
>> }
>> ==>  javac -source 1.6 -Xlint:all LastIndexOf.java
>> ==>  java -esa -ea LastIndexOf
>> -1
>> 0
>>
>>
>
>


From Ulf.Zibis at gmx.de  Sat Mar 20 23:50:30 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 00:50:30 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003191717g5ff27fd1m100985fd780fabc4@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>	
	<4B8DA070.3040306@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191717g5ff27fd1m100985fd780fabc4@mail.gmail.com>
Message-ID: <4BA55F46.7010501@gmx.de>

Am 20.03.2010 01:17, schrieb Martin Buchholz:
> On Fri, Mar 19, 2010 at 14:46, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Now I have two patches in my mq queue.
>> Martin, how do I create 2 exports in the form, you would like?
>>      
> Just copy the patch files to some public web-accessible place,
> as I do with cr.openjdk.java.net.

Just looked into .hg\patches.
Didn't imagine that things could be so simple.
I would be happy to have access to cr.openjdk.java.net.

-Ulf


From Ulf.Zibis at gmx.de  Sun Mar 21 00:13:47 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 01:13:47 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>	
	<4B8DA070.3040306@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
Message-ID: <4BA564BB.9090901@gmx.de>

Am 20.03.2010 01:13, schrieb Martin Buchholz:
> We can do it for free
> by switching isSupplementaryCodePoint =>  isValidCodePoint,
> so why not?
>    

Don't you think we should add a hint to javadoc to inform the user about 
the implementation difference between isSupplementaryCodePoint and 
isValidCodePoint?

It's likely that users would use isBMPCodePoint and
isSupplementaryCodePoint as a pair, not knowing about the performance problem.

But I would much rather see 6932837 and 6935994 fixed, so we could
stay with the conservative style.

-Ulf


From Ulf.Zibis at gmx.de  Sun Mar 21 00:18:15 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 01:18:15 +0100
Subject: Centralize bounds check in package private methods
Message-ID: <4BA565C7.9070103@gmx.de>

What do you think about this?
See attachment.

-Ulf

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: String_checkBounds
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100321/07ebcc75/String_checkBounds.ksh>

From Ulf.Zibis at gmx.de  Sun Mar 21 00:24:41 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 01:24:41 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
Message-ID: <4BA56749.8020506@gmx.de>

Sherman, please reconsider moving Surrogate.high/low to
Character.high/lowSurrogate.

-Ulf


Am 20.03.2010 19:36, schrieb Martin Buchholz:
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf/
>
>
>    


From Ulf.Zibis at gmx.de  Sun Mar 21 00:30:59 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 01:30:59 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003192201s222a233fi6f045d4d0e88febb@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8DA070.3040306@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<1ccfd1c11003192201s222a233fi6f045d4d0e88febb@mail.gmail.com>
Message-ID: <4BA568C3.8040705@gmx.de>

Fast path for BMP. +1

I can't find ensureCapacityInternal() ???

shorter:
     public AbstractStringBuilder appendCodePoint(int codePoint) {
         int count = this.count;
         if (Character.isBMPCodePoint(codePoint)) {
             ensureCapacityInternal(count + 1);
             value[count++] = (char) codePoint;
         } else if (Character.isValidCodePoint(codePoint)) {
             ensureCapacityInternal(count + 2);
             count += Character.toSurrogates(codePoint, value, count);
         } else
             throw new IllegalArgumentException();
         return this;
     }

-Ulf


Am 20.03.2010 06:01, schrieb Martin Buchholz:
> Here's another little improvement that should use isBMPCodePoint:
>
> diff --git a/src/share/classes/java/lang/AbstractStringBuilder.java
> b/src/share/classes/java/lang/AbstractStringBuilder.java
> --- a/src/share/classes/java/lang/AbstractStringBuilder.java
> +++ b/src/share/classes/java/lang/AbstractStringBuilder.java
> @@ -719,20 +719,17 @@
>        * {@code codePoint} isn't a valid Unicode code point
>        */
>       public AbstractStringBuilder appendCodePoint(int codePoint) {
> -        if (!Character.isValidCodePoint(codePoint)) {
> +        if (Character.isBMPCodePoint(codePoint)) {
> +            ensureCapacityInternal(count + 1);
> +            value[count] = (char) codePoint;
> +            count += 1;
> +        } else if (Character.isValidCodePoint(codePoint)) {
> +            ensureCapacityInternal(count + 2);
> +            Character.toSurrogates(codePoint, value, count);
> +            count += 2;
> +        } else {
>               throw new IllegalArgumentException();
>           }
> -        int n = 1;
> -        if (codePoint>= Character.MIN_SUPPLEMENTARY_CODE_POINT) {
> -            n++;
> -        }
> -        ensureCapacityInternal(count + n);
> -        if (n == 1) {
> -            value[count++] = (char) codePoint;
> -        } else {
> -            Character.toSurrogates(codePoint, value, count);
> -            count += n;
> -        }
>           return this;
>       }
>
>
> Martin
>
>
>    


From martinrb at google.com  Sun Mar 21 07:14:44 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 00:14:44 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA568C3.8040705@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<1ccfd1c11003192201s222a233fi6f045d4d0e88febb@mail.gmail.com>
	<4BA568C3.8040705@gmx.de>
Message-ID: <1ccfd1c11003210014q5dcb6d6fj6e4bfc0975d6e98c@mail.gmail.com>

On Sat, Mar 20, 2010 at 17:30, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Fast path for BMP. +1
>
> I don't find ensureCapacityInternal() ???

My patches are dependent on each other.
I've changed my patch publishing script to
publish my entire .hg/patches/ subrepo, so that
others can import my patches, including their order.

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/patches/

> shorter:
>      public AbstractStringBuilder appendCodePoint(int codePoint) {
>          int count = this.count;

You need to update this.count, not count, below.
To avoid this very common class of bugs,
I use final when I cache a field in a local of the same name.
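
A minimal sketch of that convention, using a made-up TinyBuilder class
(not JDK code). Because the cached local is final, the mistake of
incrementing the local instead of the field is rejected by the compiler
rather than silently losing the update.

```java
import java.util.Arrays;

final class TinyBuilder {
    private char[] value = new char[4];
    private int count;

    void append(char c) {
        final int count = this.count;          // cached copy of the field
        if (count == value.length) {
            value = Arrays.copyOf(value, count * 2);
        }
        value[count] = c;
        // count++;  // would not compile: a final local cannot be reassigned,
        //           // so "updated the local, forgot the field" is caught early
        this.count = count + 1;                // the field itself is updated
    }

    int length() {
        return count;
    }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }

    public static void main(String[] args) {
        TinyBuilder b = new TinyBuilder();
        for (char c : "abcde".toCharArray()) {
            b.append(c);
        }
        System.out.println(b + " (" + b.length() + ")");
    }
}
```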

Anyways, I adopted your caching of count, in
isBMPCodePoint2.

>          if (Character.isBMPCodePoint(codePoint)) {
>              ensureCapacityInternal(count + 1);
>              value[count++] = (char) codePoint;
>          } else if (Character.isValidCodePoint(codePoint)) {
>              ensureCapacityInternal(count + 2);
>              count += Character.toSurrogates(codePoint, value, count);

I'm sorry, I dislike methods that always return the same
value, just to make some client code a little shorter.
Character.toSurrogates should return void.

Martin


From martinrb at google.com  Sun Mar 21 07:20:20 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 00:20:20 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA564BB.9090901@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA564BB.9090901@gmx.de>
Message-ID: <1ccfd1c11003210020l1c200f89h95541d120fdc08cb@mail.gmail.com>

On Sat, Mar 20, 2010 at 17:13, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 20.03.2010 01:13, schrieb Martin Buchholz:
> Don't you think we should add a hint to javadoc to inform the user about the
> implementation difference between isSupplementaryCodePoint and
> isValidCodePoint?

No.

> It's likely, the user would use isBMPCodePoint and isSupplementaryCodePoint
> as pair, not knowing about the performance problem.

I don't think it's a performance problem in the real world.

We don't usually put such performance information in the javadoc.

Can you demonstrate a performance advantage
of your implementation of isSupplementaryCodePoint
for BMP characters, when there is no call to
isBMPCodePoint?  (Such a demonstration typically
requires testing on a large variety of systems and JITs)

Martin


From martinrb at google.com  Sun Mar 21 07:24:02 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 00:24:02 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA55F46.7010501@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191717g5ff27fd1m100985fd780fabc4@mail.gmail.com>
	<4BA55F46.7010501@gmx.de>
Message-ID: <1ccfd1c11003210024y2097a2acmfd0b729871f02397@mail.gmail.com>

On Sat, Mar 20, 2010 at 16:50, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 20.03.2010 01:17, schrieb Martin Buchholz:
>>
>> On Fri, Mar 19, 2010 at 14:46, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>
>>>
>>> Now I have two patches in my mq queue.
>>> Martin, how do I create 2 exports in the form, you would like?
>>>
>>
>> Just copy the patch files to some public web-accessible place,
>> as I do with cr.openjdk.java.net.
>
> Just looked into .hg\patches.
> Didn't imagine, that things could be so simple.

Mercurial does try to have mechanisms as simple as possible.
I really do think of my in-progress jdk development
as being the set of patch files in the .hg/patches directory.

Martin


From martinrb at google.com  Sun Mar 21 07:56:53 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 00:56:53 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA543A0.2060600@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>
	<4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA543A0.2060600@gmx.de>
Message-ID: <1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>

On Sat, Mar 20, 2010 at 14:52, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 20.03.2010 01:13, schrieb Martin Buchholz:
> Yep, I stepmotherly revised other methods, my focus was on String(int[],
> int, int) and outsourcing bond checks.
> BTW, what do you think of the latter?

I've done a lot of work on "out-lining" error handling.
The state of the art appears to be (from LinkedList)


    /**
     * Tells if the argument is the index of an existing element.
     */
    private boolean isElementIndex(int index) {
        return index >= 0 && index < size;
    }

    /**
     * Tells if the argument is the index of a valid position for an
     * iterator or an add operation.
     */
    private boolean isPositionIndex(int index) {
        return index >= 0 && index <= size;
    }

    /**
     * Constructs an IndexOutOfBoundsException detail message.
     * Of the many possible refactorings of the error handling code,
     * this "outlining" performs best with both server and client VMs.
     */
    private String outOfBoundsMsg(int index) {
        return "Index: "+index+", Size: "+size;
    }

    private void checkElementIndex(int index) {
        if (!isElementIndex(index))
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

    private void checkPositionIndex(int index) {
        if (!isPositionIndex(index))
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

Study also:
http://code.google.com/p/google-collections/source/browse/trunk/src/com/google/common/base/Preconditions.java


> - A little "bug" in javadoc:
>    @exception ArrayIndexOutOfBoundsException
>    instead    IndexOutOfBoundsException

Not a bug.
You do realize AIOOBE is a subclass of IOOBE?
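
The subclass relationship can be seen directly: a handler written for
IndexOutOfBoundsException also catches the exception thrown by an
out-of-range array access. A tiny sketch with an illustrative class name:

```java
final class AioobeDemo {
    // Triggers an out-of-range array store and reports whether the
    // exception, caught as IndexOutOfBoundsException, is in fact an
    // ArrayIndexOutOfBoundsException.
    static boolean caughtAsIoobe() {
        int[] a = new int[1];
        try {
            a[1] = 0;                                   // index 1 is out of range
            return false;
        } catch (IndexOutOfBoundsException e) {
            return e instanceof ArrayIndexOutOfBoundsException;
        }
    }

    public static void main(String[] args) {
        System.out.println(caughtAsIoobe());
        System.out.println(IndexOutOfBoundsException.class
                .isAssignableFrom(ArrayIndexOutOfBoundsException.class));
    }
}
```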

> - String#indexOf(int, int):
>               // handle supplementary characters here
>               char high = Character.highSurrogate(ch);
>               char low = Character.lowSurrogate(ch);
>               for ( ; i < max-1; i++)
>                   if (v[i] == high && v[i+1] == low)
>                           return i - offset;

I now believe we should provide
Character.highSurrogate and Character.lowSurrogate
as you have been advocating.

If Sherman agrees, let's put a proper patch for this together.

Martin


From martinrb at google.com  Sun Mar 21 08:05:52 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 01:05:52 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA55149.5070603@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA546C1.5040201@gmx.de> <4BA55149.5070603@gmx.de>
Message-ID: <1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>

On Sat, Mar 20, 2010 at 15:50, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Oops, later I looked in your webrev and saw your same idea at same time
> while I was composing my before-last email.
>
> Why don't you factor out indexOfBMP and lastIndexOfBMP as well? Or, to be
> frank, IMO this is too much source code + bytecode overhead for a 3-liner
> that is used only once.

I'm not sure I understand your intent.

> I doubt that all the finals have any benefit. Some time ago I too fell
> into that trap, or am I wrong? Examine the disassembly.

My use of "final" is almost always for software engineering reasons,
not for performance reasons.

Martin


From martinrb at google.com  Sun Mar 21 08:16:05 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 01:16:05 -0700
Subject: Centralize bounds check in package private methods
In-Reply-To: <4BA565C7.9070103@gmx.de>
References: <4BA565C7.9070103@gmx.de>
Message-ID: <1ccfd1c11003210116u571a863ew8ca8d5104e4e8997@mail.gmail.com>

This is definitely on the right track.
We should borrow as many of the ideas and
method names from LinkedList and Preconditions as we can.
But compatibility will force us to maintain our own
versions of these methods.

Someday I'd like to see much of
google-collections/guava in the jdk.
Particularly Preconditions.

Martin

On Sat, Mar 20, 2010 at 17:18, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> What do you think about?
> See attachment.
>
> -Ulf
>
>


From Ulf.Zibis at gmx.de  Sun Mar 21 10:00:39 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 11:00:39 +0100
Subject: Centralize bounds check in package private methods
In-Reply-To: <1ccfd1c11003210116u571a863ew8ca8d5104e4e8997@mail.gmail.com>
References: <4BA565C7.9070103@gmx.de>
	<1ccfd1c11003210116u571a863ew8ca8d5104e4e8997@mail.gmail.com>
Message-ID: <4BA5EE47.6070001@gmx.de>

That's great, it gives me more time for other work.
Preconditions seems a good candidate for the java.util package.

Some of the names seem a little strange to me. Example:

     String  errorMessageTemplate,  Object...  errorMessageArgs
Why not just:
      errorFormat,  errorArgs
      messageFormat,  messageArgs

The term 'format' is known from printf.


Have you noticed the "creative" variations on messages in class AbstractStringBuilder?

-Ulf


Am 21.03.2010 09:16, schrieb Martin Buchholz:
> This is definitely on the right track.
> We should borrow as many of the ideas and
> method names from LinkedList and Preconditions as we can.
> But compatibility will force us to maintain our own
> versions of these methods.
>
> Someday I'd like to see much of
> google-collections/guava in the jdk.
> Particularly Preconditions.
>
> Martin
>
> On Sat, Mar 20, 2010 at 17:18, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> What do you think about?
>> See attachment.
>>
>> -Ulf
>>
>>
>>      
>
>    


From Ulf.Zibis at gmx.de  Sun Mar 21 10:13:25 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 11:13:25 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
Message-ID: <4BA5F145.2020408@gmx.de>

Am 21.03.2010 08:56, schrieb Martin Buchholz:
> On Sat, Mar 20, 2010 at 14:52, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am 20.03.2010 01:13, schrieb Martin Buchholz:
>> Yep, I only gave the other methods a cursory revision; my focus was on
>> String(int[], int, int) and on factoring out the bounds checks.
>> BTW, what do you think of the latter?
>>      
> I've done a lot of work on "out-lining" error handling.
> The state of the art appears to be (from LinkedList)
>    

Additionally, I have been mulling over the idea of a class
AbstractString/AbstractCharSequence from which e.g. String and
AbstractStringBuilder could inherit.
Several methods could be centralized there and, where appropriate,
overridden in the subclasses.
Several static methods of class Character could also find their home there.

-Ulf


From Ulf.Zibis at gmx.de  Sun Mar 21 10:24:49 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 11:24:49 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA546C1.5040201@gmx.de> <4BA55149.5070603@gmx.de>
	<1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>
Message-ID: <4BA5F3F1.8080609@gmx.de>

Am 21.03.2010 09:05, schrieb Martin Buchholz:
> On Sat, Mar 20, 2010 at 15:50, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Why don't you factor out indexOfBMP and lastIndexOfBMP as well? Or, to be
>> frank, IMO this is too much source code + bytecode overhead for a 3-liner
>> that is used only once.
>>      
> I'm not sure I understand your intent.
>    

I think we should not define a distinct method for this once-used 3-liner:
              for (; i < max-1; i++)
                  if (v[i] == high && v[i+1] == low)
                          return i - offset;

HotSpot's resources should not be over-stressed by inlining such things,
so that more reserves remain for more important things.


>    
>> I doubt that all the finals have any benefit. Some time ago I too fell
>> into that trap, or am I wrong? Examine the disassembly.
>>      
> My use of "final" is almost always for software engineering reasons,
> not for performance reasons.
>    

Ah, ok, just a kind of coding style.

-Ulf


From Ulf.Zibis at gmx.de  Sun Mar 21 10:29:04 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 11:29:04 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003210014q5dcb6d6fj6e4bfc0975d6e98c@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<1ccfd1c11003192201s222a233fi6f045d4d0e88febb@mail.gmail.com>	
	<4BA568C3.8040705@gmx.de>
	<1ccfd1c11003210014q5dcb6d6fj6e4bfc0975d6e98c@mail.gmail.com>
Message-ID: <4BA5F4F0.7040905@gmx.de>

Am 21.03.2010 08:14, schrieb Martin Buchholz:
> On Sat, Mar 20, 2010 at 17:30, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>
> You need to update this.count, not count, below.
> To avoid this very common class of bugs,
> I use final when I cache a field in a local of the same name.

Oops, it was late last night.
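
For the archive, a minimal sketch of the class of bug Martin describes. The
class and its methods are hypothetical and not taken from the patch under
review. With a non-final local that shadows the field, the update silently
goes to the local; with a final local, the same mistake does not compile:

```java
// Hypothetical class for illustration; not code from the patch under review.
class FinalLocalDemo {
    private int count;

    /** Buggy variant: the update goes to the local copy and is lost. */
    void appendBuggy(int n) {
        int count = this.count; // non-final local shadows the field
        count += n;             // compiles, but the field is never updated
    }

    /** Safer variant: a final local cannot be assigned by accident. */
    void append(int n) {
        final int count = this.count;
        // count += n;          // would be rejected by the compiler
        this.count = count + n; // the field has to be named explicitly
    }

    int count() {
        return count;
    }
}
```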

-Ulf


From Ulf.Zibis at gmx.de  Sun Mar 21 11:28:57 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 12:28:57 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003210020l1c200f89h95541d120fdc08cb@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA564BB.9090901@gmx.de>
	<1ccfd1c11003210020l1c200f89h95541d120fdc08cb@mail.gmail.com>
Message-ID: <4BA602F9.7000408@gmx.de>

>
> On Sat, Mar 20, 2010 at 17:13, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am 20.03.2010 01:13, schrieb Martin Buchholz:
>> Don't you think we should add a hint to javadoc to inform the user about the
>> implementation difference between isSupplementaryCodePoint and
>> isValidCodePoint?
>>      
> No.
>
>    
>> It's likely, the user would use isBMPCodePoint and isSupplementaryCodePoint
>> as pair, not knowing about the performance problem.
>>      
> I don't think it's a performance problem in the real world.
>    

Hm, if someone uses:
      if (Character.isBMPCodePoint(codePoint))
          ...;
      else if (Character.isSupplementaryCodePoint(codePoint)) // instead of isValidCodePoint()
          ...;
      else
          ...;
he will lose up to 50 % performance, as you can see in my benchmark of
isSuppCPAlaMartin().
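
Spelled out, the chain I would recommend looks roughly like the sketch
below. It is written against the method name as it later shipped in Java 7
(Character.isBmpCodePoint, where this thread writes isBMPCodePoint); the
class and the kind helper are made up for illustration, and no performance
claim is attached to it:

```java
// Hypothetical demo class; assumes Java 7 or later for isBmpCodePoint.
class CodePointKindDemo {
    /** Classifies an int as "BMP", "supplementary" or "invalid". */
    static String kind(int codePoint) {
        if (Character.isBmpCodePoint(codePoint)) {
            return "BMP";
        } else if (Character.isValidCodePoint(codePoint)) {
            // BMP is already excluded here, so "valid" implies "supplementary"
            // and the lower-bound check of isSupplementaryCodePoint is redundant.
            return "supplementary";
        } else {
            return "invalid";
        }
    }

    public static void main(String[] args) {
        System.out.println(kind('A'));
        System.out.println(kind(0x1D11E));
        System.out.println(kind(0x110000));
    }
}
```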


> We don't usually put such performance information in the javadoc.
>    

In class StringBuilder:
"Where possible, it is recommended that this class be used in preference
to StringBuffer as it will be faster under most implementations."

java.util.List:
Note that these operations may execute in time proportional to the index 
value for some implementations (the LinkedList class, for example).

ByteBuffer#get(byte[],int,int):
In other words, an invocation of this method of the form 
src.get(dst, off, len) has exactly the same effect as the loop

      for (int i = off; i < off + len; i++)
          dst[i] = src.get();

except that it first checks that there are sufficient bytes in this 
buffer *and it is potentially much more efficient*.


> Can you demonstrate a performance advantage
> of your implementation of isSupplementaryCodePoint
> for BMP characters, when there is no call to
> isBMPCodePoint?  (Such a demonstration typically
> requires testing on a large variety of systems and JITs)
>    

I'm not sure I understand correctly.
I think my benchmark of isSuppCPAlaMartin() demonstrates that.

Anyway, even if isSupplementaryCodePoint() is used in isolation, my code
will help the JIT to use 2-byte shifted addressing and a shorter 2-byte
immediate value for the compare, but yes, the JIT should be able to catch
that without this help. But in that case, we could also stay with the old
implementations of isBMPCodePoint and isValidCodePoint.


-Ulf

From Ulf.Zibis at gmx.de  Sun Mar 21 12:00:00 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 13:00:00 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
Message-ID: <4BA60A40.9050600@gmx.de>

>
> On Sat, Mar 20, 2010 at 14:52, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> - A little "bug" in javadoc:
>>   @exception ArrayIndexOutOfBoundsException
>>   instead    IndexOutOfBoundsException
>>      
> Not a bug.
>    

Yes, but it reduces the user's ability to catch exceptions more
precisely and flexibly.
Imagine a method that throws an IndexOutOfBoundsException for some
reason and also calls Character.toChars(). The caller of such a method
could distinguish where the exception came from, and have separate
catch blocks. But if it is not documented ... :-(

Taken to the extreme, the following would not be a bug in your sense either:
   @exception Exception

I became sensitive to this because I have seen real bugs of the opposite
kind in AbstractStringBuilder, where methods actually throw
IndexOutOfBoundsException, but their javadoc states
StringIndexOutOfBoundsException.

It would be a nice game for Easter to invite people to search for those
bugs in the JDK code base rather than for coloured eggs.


> You do realize AIOOBE is a subclass of IOOBE?
>    

Yes.


-Ulf


From martinrb at google.com  Sun Mar 21 12:35:04 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 05:35:04 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA5F3F1.8080609@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA546C1.5040201@gmx.de> <4BA55149.5070603@gmx.de>
	<1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>
	<4BA5F3F1.8080609@gmx.de>
Message-ID: <1ccfd1c11003210535l1505a0bfoe9f14c6ad4b07c42@mail.gmail.com>

On Sun, Mar 21, 2010 at 03:24, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 21.03.2010 09:05, schrieb Martin Buchholz:
>>
>> On Sat, Mar 20, 2010 at 15:50, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
> I think, we should not define a distinct method for this once-used 3-liner:
>             for (; i < max-1; i++)
>                 if (v[i] == high && v[i+1] == low)
>                         return i - offset;
>
> HotSpots resources should not be over-stressed to inline such things, having
> more reserves for more important things.

On the contrary -
normally the above code snippet will rarely be executed,
and so will normally not be inlined into the caller,
which makes it easier for hotspot to inline
the caller into its caller.  Separate cold code into
separate methods.
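
A rough sketch of the shape I have in mind, simplified to a bare char[]
without the offset/count bookkeeping of String. The class name is made up,
this is not the webrev itself, and it assumes Java 7 or later for
highSurrogate and lowSurrogate:

```java
// Hypothetical, simplified sketch; not the code from the webrev.
class ColdPathDemo {
    /** Hot path: BMP values are searched inline, so this method stays small. */
    static int indexOf(char[] v, int ch) {
        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // Also reached for negative (invalid) values, which never match.
            for (int i = 0; i < v.length; i++) {
                if (v[i] == ch) return i;
            }
            return -1;
        }
        return indexOfSupplementary(v, ch);
    }

    /** Cold path: kept out of line because it is expected to run rarely. */
    private static int indexOfSupplementary(char[] v, int ch) {
        if (Character.isValidCodePoint(ch)) {
            final char hi = Character.highSurrogate(ch);
            final char lo = Character.lowSurrogate(ch);
            for (int i = 0; i < v.length - 1; i++) {
                if (v[i] == hi && v[i + 1] == lo) return i;
            }
        }
        return -1;
    }
}
```

Note that the call to the cold method sits outside the search loop, so it
costs at most one call per lookup.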

BTW, in case you try to benchmark this,
hotspot intrinsifies indexOf by default.

Martin


From Ulf.Zibis at gmx.de  Sun Mar 21 12:53:34 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 13:53:34 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4A9578C4.8060801@sun.com>	
	<4B8DA070.3040306@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
Message-ID: <4BA616CE.3090003@gmx.de>

>
> diff --git a/src/share/classes/sun/nio/cs/Surrogate.java
> b/src/share/classes/sun/nio/cs/Surrogate.java
> --- a/src/share/classes/sun/nio/cs/Surrogate.java
> +++ b/src/share/classes/sun/nio/cs/Surrogate.java
> @@ -294,7 +294,7 @@
>                   dst.put((char)uc);
>                   error = null;
>                   return 1;
> -            } else if (Character.isSupplementaryCodePoint(uc)) {
> +            } else if (Character.isValidCodePoint(uc)) {
>                   if (dst.remaining()<  2) {
>                       error = CoderResult.OVERFLOW;
>                       return -1;
> @@ -338,7 +338,7 @@
>                   da[dp] = (char)uc;
>                   error = null;
>                   return 1;
> -            } else if (Character.isSupplementaryCodePoint(uc)) {
> +            } else if (Character.isValidCodePoint(uc)) {
>                   if (dl - dp<  2) {
>                       error = CoderResult.OVERFLOW;
>                       return -1;
>
>    

Have you searched for usages of Surrogate.isNeededFor() and 
Character.isSupplementaryCodePoint() in sun.nio.cs.**.* and elsewhere?
If used paired with *.isBMP.* it should be replaced by Surrogate.isBMP() 
/ Character.isBMPCodePoint().

-Ulf


From Ulf.Zibis at gmx.de  Sun Mar 21 13:06:30 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 14:06:30 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4BA616CE.3090003@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<4A9578C4.8060801@sun.com>		<4B8DA070.3040306@gmx.de>		<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>		<4B8E3DA3.7090902@gmx.de>		<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>		<4B9FE4DD.1090405@sun.com>		<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>		<4BA007A4.2030907@sun.com>
	<4BA3F0B5.1070404@gmx.de>	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA616CE.3090003@gmx.de>
Message-ID: <4BA619D6.3080907@gmx.de>

>
>>
>> diff --git a/src/share/classes/sun/nio/cs/Surrogate.java
>> b/src/share/classes/sun/nio/cs/Surrogate.java
>> --- a/src/share/classes/sun/nio/cs/Surrogate.java
>> +++ b/src/share/classes/sun/nio/cs/Surrogate.java
>> @@ -294,7 +294,7 @@
>>                   dst.put((char)uc);
>>                   error = null;
>>                   return 1;
>> -            } else if (Character.isSupplementaryCodePoint(uc)) {
>> +            } else if (Character.isValidCodePoint(uc)) {
>>                   if (dst.remaining()<  2) {
>>                       error = CoderResult.OVERFLOW;
>>                       return -1;
>> @@ -338,7 +338,7 @@
>>                   da[dp] = (char)uc;
>>                   error = null;
>>                   return 1;
>> -            } else if (Character.isSupplementaryCodePoint(uc)) {
>> +            } else if (Character.isValidCodePoint(uc)) {
>>                   if (dl - dp<  2) {
>>                       error = CoderResult.OVERFLOW;
>>                       return -1;
>>
>
> Have you searched for usages of Surrogate.isNeededFor() and 
> Character.isSupplementaryCodePoint() in sun.nio.cs.**.* and elsewhere?
> If used paired with *.isBMP.* it should be replaced by 
> Surrogate.isBMP() / Character.isBMPCodePoint().
>
> -Ulf

correction:
    If used paired with *.isBMP.* it should be replaced by 
Surrogate.is() / Character.isValidCodePoint().


From Ulf.Zibis at gmx.de  Sun Mar 21 13:16:48 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 14:16:48 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003210535l1505a0bfoe9f14c6ad4b07c42@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA546C1.5040201@gmx.de> <4BA55149.5070603@gmx.de>	
	<1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>	
	<4BA5F3F1.8080609@gmx.de>
	<1ccfd1c11003210535l1505a0bfoe9f14c6ad4b07c42@mail.gmail.com>
Message-ID: <4BA61C40.8060608@gmx.de>

>
> On Sun, Mar 21, 2010 at 03:24, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am 21.03.2010 09:05, schrieb Martin Buchholz:
>>      
>>> On Sat, Mar 20, 2010 at 15:50, Ulf Zibis<Ulf.Zibis at gmx.de>    wrote:
>>>        
>> I think, we should not define a distinct method for this once-used 3-liner:
>>              for (; i < max-1; i++)
>>                  if (v[i] == high && v[i+1] == low)
>>                          return i - offset;
>>
>> HotSpots resources should not be over-stressed to inline such things, having
>> more reserves for more important things.
>>      
> On the contrary -
> normally the above code snippet will rarely be executed,
> and so will normally not be inlined into the caller,
> which makes it easier for hotspot to inline
> the caller into its caller.  Separate cold code into
> separate methods.
>    

Thanks, I got the idea.

But isn't the push-call-pop-return overhead comparable to those 3
lines here, not to forget caching the 3 values once more?

-Ulf


> BTW, in case you try to benchmark this,
> hotspot intrinsifies indexOf by default.
>
> Martin
>
>
>    


From Ulf.Zibis at gmx.de  Sun Mar 21 13:35:33 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 14:35:33 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA61C40.8060608@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>		<4BA546C1.5040201@gmx.de>
	<4BA55149.5070603@gmx.de>		<1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>		<4BA5F3F1.8080609@gmx.de>	<1ccfd1c11003210535l1505a0bfoe9f14c6ad4b07c42@mail.gmail.com>
	<4BA61C40.8060608@gmx.de>
Message-ID: <4BA620A5.5020608@gmx.de>

>
>>
>> On Sun, Mar 21, 2010 at 03:24, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>> Am 21.03.2010 09:05, schrieb Martin Buchholz:
>>>> On Sat, Mar 20, 2010 at 15:50, Ulf Zibis<Ulf.Zibis at gmx.de>    wrote:
>>> I think, we should not define a distinct method for this once-used 
>>> 3-liner:
>>>              for (; i < max-1; i++)
>>>                  if (v[i] == high && v[i+1] == low)
>>>                          return i - offset;
>>>
>>> HotSpots resources should not be over-stressed to inline such 
>>> things, having
>>> more reserves for more important things.
>> On the contrary -
>> normally the above code snippet will rarely be executed,
>> and so will normally not be inlined into the caller,
>> which makes it easier for hotspot to inline
>> the caller into its caller.  Separate cold code into
>> separate methods.
>
> Thanks, I got the idea.
>
> But Isn't the push-call-pop-return overhead comparable with those 3 
> lines here, not to forget the repeated cache-3-values-once-more?

And additionally, the slow, rarely used branch would stay set in stone, even
if after some time the inline threshold is reached, as the JIT,
AFAIK, can't count how often compiled code is used.

-Ulf


From Ulf.Zibis at gmx.de  Sun Mar 21 15:42:26 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 21 Mar 2010 16:42:26 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003030000l6adaddd7wd2084fb29a6cda83@mail.gmail.com>	
	<4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
Message-ID: <4BA63E62.2010304@gmx.de>

Am 21.03.2010 08:56, schrieb Martin Buchholz:
> On Sat, Mar 20, 2010 at 14:52, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>
> I now believe we should provide
> Character.highSurrogate and Character.lowSurrogate
> as you have been advocating.
>
> If Sherman agrees, let's put a proper patch for this together.
>    

- I would also move the charCount logic from String(int[], int, int) to
class Character, at least as a package-private helper method. There is
already another charCount method in the neighbourhood.
- Additionally, logic to handle invalid surrogate code points might be
interesting.

I've attached the newest version of my patch, which you can compare with
your current state, ignoring some style differences etc.

-Ulf

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Character_charCount
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100321/5d396e58/Character_charCount.ksh>

From martinrb at google.com  Sun Mar 21 16:16:35 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 09:16:35 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA60A40.9050600@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
	<4BA60A40.9050600@gmx.de>
Message-ID: <1ccfd1c11003210916rad35d31wb0501f9bf960b07@mail.gmail.com>

On Sun, Mar 21, 2010 at 05:00, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>
>> On Sat, Mar 20, 2010 at 14:52, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>
>>>
>>> - A little "bug" in javadoc:
>>>   @exception ArrayIndexOutOfBoundsException
>>>   instead    IndexOutOfBoundsException
>>>
>>
>> Not a bug.
>>
>
> Yes, but it reduces the user's ability to catch exceptions more precisely
> and flexibly.

There is a debate about whether to reuse existing exception classes
or to throw class-specific subclasses.  IMO, IOOBE is a sufficiently expressive
exception that I might have used just that, with expressive detail messages.

But that's only a consideration when designing new API or a new platform.
Old API must stay unchanged, for compatibility.

> Imagine a method that throws an IndexOutOfBoundsException for some reason
> and also calls Character.toChars(). The caller of such a method could
> distinguish where the exception came from, and have separate catch
> blocks. But if it is not documented ... :-(
>
> Taken to the extreme, the following would not be a bug in your sense either:
>   @exception Exception
>
> I became sensitive to this because I have seen real bugs of the opposite
> kind in AbstractStringBuilder, where methods actually throw
> IndexOutOfBoundsException, but their javadoc states
> StringIndexOutOfBoundsException.

Now that's a real bug.

Martin


From martinrb at google.com  Sun Mar 21 16:23:28 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 09:23:28 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA602F9.7000408@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA564BB.9090901@gmx.de>
	<1ccfd1c11003210020l1c200f89h95541d120fdc08cb@mail.gmail.com>
	<4BA602F9.7000408@gmx.de>
Message-ID: <1ccfd1c11003210923q27e4d8bdj913350d8bca58195@mail.gmail.com>

On Sun, Mar 21, 2010 at 04:28, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On Sat, Mar 20, 2010 at 17:13, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:

> I don't think it's a performance problem in the real world.
>
>
> Hm, if someone uses:
>      if (Character.isBMPCodePoint(codePoint))
>          ...;
>      else if (Character.isSupplementaryCodePoint(codePoint)) // instead of isValidCodePoint()
>          ...;
>      else
>          ...;
> he will lose up to 50 % performance, as you can see in my benchmark of
> isSuppCPAlaMartin().

Only if their data is full of supplementary characters.

> We don't usually put such performance information in the javadoc.
>
>
> In class StringBuilder:
> "Where possible, it is recommended that this class be used in preference to
> StringBuffer as it will be faster under most implementations."
>
> java.util.List:
> Note that these operations may execute in time proportional to the index
> value for some implementations (the LinkedList class, for example).
>
> ByteBuffer#get(byte[],int,int):
> In other words, an invocation of this method of the form
> src.get(dst, off, len) has exactly the same effect as the loop
>
>      for (int i = off; i < off + len; i++)
>          dst[i] = src.get();
>
> except that it first checks that there are sufficient bytes in this buffer
> and it is potentially much more efficient.

In the above cases, performance is a raison d'être of the API,
something real users should consider when choosing an API.

> Anyway, even if isSupplementaryCodePoint() is used in isolation, my code will
> help the JIT to use 2-byte shifted addressing and a shorter 2-byte immediate
> value for the compare, but yes, the JIT should be able to catch that without
> this help. But in that case, we could also stay with the old implementations
> of isBMPCodePoint and isValidCodePoint.

Again, performance with BMP characters is infinitely more important
than performance with supplementary characters.

Martin


From martinrb at google.com  Sun Mar 21 17:38:04 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 10:38:04 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA620A5.5020608@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA546C1.5040201@gmx.de> <4BA55149.5070603@gmx.de>
	<1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>
	<4BA5F3F1.8080609@gmx.de>
	<1ccfd1c11003210535l1505a0bfoe9f14c6ad4b07c42@mail.gmail.com>
	<4BA61C40.8060608@gmx.de> <4BA620A5.5020608@gmx.de>
Message-ID: <1ccfd1c11003211038p5681441cqf7ba500b9f8079e5@mail.gmail.com>

On Sun, Mar 21, 2010 at 06:35, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>
>>>
>>> On Sun, Mar 21, 2010 at 03:24, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>>>
>>>> Am 21.03.2010 09:05, schrieb Martin Buchholz:
>>>>>
>>>>> On Sat, Mar 20, 2010 at 15:50, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>>>
>>>> I think, we should not define a distinct method for this once-used
>>>> 3-liner:
>>>>             for (; i < max-1; i++)
>>>>                 if (v[i] == high && v[i+1] == low)
>>>>                         return i - offset;
>>>>
>>>> HotSpots resources should not be over-stressed to inline such things,
>>>> having
>>>> more reserves for more important things.
>>>
>>> On the contrary -
>>> normally the above code snippet will rarely be executed,
>>> and so will normally not be inlined into the caller,
>>> which makes it easier for hotspot to inline
>>> the caller into its caller.  Separate cold code into
>>> separate methods.
>>
>> Thanks, I got the idea.
>>
>> But Isn't the push-call-pop-return overhead comparable with those 3 lines
>> here, not to forget the repeated cache-3-values-once-more?

Even if I'm wrong, and this cold code is actually hot,
I don't think there will be a big performance loss.
The method call is outside the loop.

> And additionally, the slow, rarely used branch would stay set in stone, even
> if after some time the inline threshold is reached, as the JIT, AFAIK, can't
> count how often compiled code is used.

It's certainly the intent that we will have multiple levels of
compilation ("tiered compilation") and profiling would be
enabled on at least some compiled code.

Martin


From martinrb at google.com  Sun Mar 21 19:39:17 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 21 Mar 2010 12:39:17 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>
	<4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
Message-ID: <1ccfd1c11003211239h2105e5f1m903dd5d3fbf5387b@mail.gmail.com>

On Sun, Mar 21, 2010 at 00:56, Martin Buchholz <martinrb at google.com> wrote:

> Study also:
> http://code.google.com/p/google-collections/source/browse/trunk/src/com/google/common/base/Preconditions.java

Sorry, the best (most recent) version of Preconditions to study is here:

http://code.google.com/p/guava-libraries/source/browse/trunk/src/com/google/common/base/Preconditions.java

especially this comment:

  /*
   * All recent hotspots (as of 2009) *really* like to have the natural code
   *
   * if (guardExpression) {
   *    throw new BadException(messageExpression);
   * }
   *
   * refactored so that messageExpression is moved to a separate
   * String-returning method.
   *
   * if (guardExpression) {
   *    throw new BadException(badMsg(...));
   * }
   *
   * The alternative natural refactorings into void or Exception-returning
   * methods are much slower.  This is a big deal - we're talking factors of
   * 2-8 in microbenchmarks, not just 10-20%.  (This is a hotspot optimizer
   * bug, which should be fixed, but that's a separate, big project).
   *
   * The coding pattern above is heavily used in java.util, e.g. in ArrayList.
   * There is a RangeCheckMicroBenchmark in the JDK that was used to test this.
   *
   * But the methods in this class want to throw different exceptions,
   * depending on the args, so it appears that this pattern is not directly
   * applicable.  But we can use the ridiculous, devious trick of throwing an
   * exception in the middle of the construction of another exception.
   * Hotspot is fine with that.
   */
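
The pattern described in the comment above can be sketched as follows. This is a minimal illustration, not the Preconditions or ArrayList source: the class RangeCheckSketch, the helper outOfBoundsMsg, and the message format are assumptions made up for the example.

```java
// Hypothetical container used only to illustrate the guard pattern.
final class RangeCheckSketch {
    private final int[] elements;
    private final int size;

    RangeCheckSketch(int... elements) {
        this.elements = elements.clone();
        this.size = elements.length;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            // The guard and the throw stay here; only the message
            // construction is moved to a String-returning helper.
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
        }
        return elements[index];
    }

    // Cold path: the string concatenation lives here, keeping get() small.
    private String outOfBoundsMsg(int index) {
        return "Index: " + index + ", Size: " + size;
    }
}
```

The helper is only reached when the guard fails, so the bytecode for the concatenation does not count against the size of the hot method.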


Martin


From christopher.hegarty at sun.com  Mon Mar 22 12:00:30 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Mon, 22 Mar 2010 12:00:30 +0000
Subject: hg: jdk7/tl/jdk: 6632169: HttpClient and HttpsClient should not try
	to reverse lookup IP address of a proxy server
Message-ID: <20100322120310.8BCBE4465E@hg.openjdk.java.net>

Changeset: c40572afb29e
Author:    chegar
Date:      2010-03-22 11:55 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/c40572afb29e

6632169: HttpClient and HttpsClient should not try to reverse lookup IP address of a proxy server
Reviewed-by: michaelm

! src/share/classes/sun/net/www/protocol/https/HttpsClient.java


From Ulf.Zibis at gmx.de  Mon Mar 22 13:57:46 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Mon, 22 Mar 2010 14:57:46 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003210923q27e4d8bdj913350d8bca58195@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA564BB.9090901@gmx.de>	
	<1ccfd1c11003210020l1c200f89h95541d120fdc08cb@mail.gmail.com>	
	<4BA602F9.7000408@gmx.de>
	<1ccfd1c11003210923q27e4d8bdj913350d8bca58195@mail.gmail.com>
Message-ID: <4BA7775A.2080506@gmx.de>

On 21.03.2010 17:23, Martin Buchholz wrote:
> On Sun, Mar 21, 2010 at 04:28, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> On Sat, Mar 20, 2010 at 17:13, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>      
>    
>> I don't think it's a performance problem in the real world.
>>
>>
>> Hm, if someone uses:
>>       if (Character.isBMPCodePoint(codePoint))
>>           ...;
>>       else if (Character.isSupplementaryCodePoint(codePoint)) // instead of isValidCodePoint()
>>           ...;
>>       else
>>           ...;
>> he will lose up to 50 % performance, as you can see in my benchmark of
>> isSuppCPAlaMartin().
>>      
> Only if their data is full of supplementary characters.
>    

Yes, but we don't know anything about the purpose of the code written out
there in the world, so why not provide the best performance, or at least give a
hint in the docs, if it doesn't cost anything?

>    
>> We don't usually put such performance information in the javadoc.
>>
>>
>> In class StringBuilder:
>> "Where possible, it is recommended that this class be used in preference to
>> StringBuffer as it will be faster under most implementations."
>>
>> java.util.List:
>> Note that these operations may execute in time proportional to the index
>> value for some implementations (the LinkedList class, for example).
>>
>> ByteBuffer#get(byte[],int,int):
>> In other words, an invocation of this method of the form
>> src.get(dst, off, len) has exactly the same effect as the loop
>>
>>       for (int i = off; i < off + len; i++)
>>           dst[i] = src.get();
>>
>> except that it first checks that there are sufficient bytes in this buffer
>> and it is potentially much more efficient.
>>      
> In the above, the performance is a raison d'être of the API,
> that real users should consider when choosing API.
>    

Oh, on parle français. Je l'aime beaucoup.

>    
>> Anyway, even if isSupplementaryCodePoint() is used in isolation, my code will
>> help the JIT to use 2-byte shifted addressing and a shorter 2-byte immediate
>> value for the compare, but yes, the JIT should be able to catch that without
>> this help. But for that case, we could stay with the old implementations too
>> for isBMPCodePoint and isValidCodePoint.
>>      
> Again, performance with BMP characters is infinitely more important
> than performance with supplementary characters.
>    

You are right. But I can't see any reason why the fast supplementary
version would harm the BMP performance.
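
To make the shift-based checks under discussion concrete, here is a minimal sketch. It is not the patch under review: the class CodePointChecks and its method names are hypothetical, and the only assumption is that the "plane" of a code point is its value shifted right by 16 bits.

```java
// Hypothetical helper class; not the code from the webrev.
final class CodePointChecks {
    // Number of Unicode planes (17), derived from the JDK constant.
    private static final int PLANE_LIMIT = (Character.MAX_CODE_POINT + 1) >>> 16;

    // BMP code points fit in 16 bits, so all upper bits must be zero.
    static boolean isBmp(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // Valid code points lie in planes 0..16. The unsigned shift maps
    // negative inputs to large plane numbers, so they are rejected too.
    static boolean isValid(int codePoint) {
        return (codePoint >>> 16) < PLANE_LIMIT;
    }

    // Supplementary code points lie in planes 1..16.
    static boolean isSupplementary(int codePoint) {
        int plane = codePoint >>> 16;
        return plane != 0 && plane < PLANE_LIMIT;
    }
}
```

All three checks compare a small plane number rather than the full 21-bit value, which is the kind of short-immediate comparison mentioned above. Whether that is measurably faster depends on the JIT and is not shown by this sketch.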

-Ulf


From forax at univ-mlv.fr  Mon Mar 22 14:33:30 2010
From: forax at univ-mlv.fr (Rémi Forax)
Date: Mon, 22 Mar 2010 15:33:30 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4BA7775A.2080506@gmx.de>
References: <4A95079A.8080803@gmx.de>		<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>		<4B9FE4DD.1090405@sun.com>		<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>		<4BA007A4.2030907@sun.com>
	<4BA3F0B5.1070404@gmx.de>		<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>		<4BA564BB.9090901@gmx.de>		<1ccfd1c11003210020l1c200f89h95541d120fdc08cb@mail.gmail.com>		<4BA602F9.7000408@gmx.de>	<1ccfd1c11003210923q27e4d8bdj913350d8bca58195@mail.gmail.com>
	<4BA7775A.2080506@gmx.de>
Message-ID: <4BA77FBA.9060308@univ-mlv.fr>

On 22/03/2010 14:57, Ulf Zibis wrote:
> On 21.03.2010 17:23, Martin Buchholz wrote:
>> On Sun, Mar 21, 2010 at 04:28, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>> On Sat, Mar 20, 2010 at 17:13, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>> I don't think it's a performance problem in the real world.
>>>
>>>
>>> Hm, if someone uses:
>>>       if (Character.isBMPCodePoint(codePoint))
>>>           ...;
>>>       else if (Character.isSupplementaryCodePoint(codePoint)) // instead of isValidCodePoint()
>>>           ...;
>>>       else
>>>           ...;
>>> he will lose up to 50 % performance, as you can see in my benchmark of
>>> isSuppCPAlaMartin().
>> Only if their data is full of supplementary characters.
>
> Yes, but we don't know anything about the purpose of the code written
> out there in the world, so why not provide the best performance, or at
> least give a hint in the docs, if it doesn't cost anything?
>
>>> We don't usually put such performance information in the javadoc.
>>>
>>>
>>> In class StringBuilder:
>>> "Where possible, it is recommended that this class be used in 
>>> preference to
>>> StringBuffer as it will be faster under most implementations."
>>>
>>> java.util.List:
>>> Note that these operations may execute in time proportional to the 
>>> index
>>> value for some implementations (the LinkedList class, for example).
>>>
>>> ByteBuffer#get(byte[],int,int):
>>> In other words, an invocation of this method of the form
>>> src.get(dst, off, len) has exactly the same effect as the loop
>>>
>>>       for (int i = off; i < off + len; i++)
>>>           dst[i] = src.get();
>>>
>>> except that it first checks that there are sufficient bytes in this 
>>> buffer
>>> and it is potentially much more efficient.
>> In the above, the performance is a raison d'être of the API,
>> that real users should consider when choosing API.
>
> Oh, on parle français. Je l'aime beaucoup.

Totally off topic, but
you mean: "j'aime beaucoup".
"Je l'aime beaucoup" means "I love him/her a lot".

>
>>> Anyway, even if isSupplementaryCodePoint() is used in isolation, my code
>>> will help the JIT to use 2-byte shifted addressing and a shorter 2-byte
>>> immediate value for the compare, but yes, the JIT should be able to catch
>>> that without this help. But for that case, we could stay with the old
>>> implementations too for isBMPCodePoint and isValidCodePoint.
>> Again, performance with BMP characters is infinitely more important
>> than performance with supplementary characters.
>
> You are right. But I can't see any reason, why the fast supplementary 
> version would harm the BMP performance.
>
> -Ulf
>

Rémi


From Ulf.Zibis at gmx.de  Mon Mar 22 14:34:57 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Mon, 22 Mar 2010 15:34:57 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003210916rad35d31wb0501f9bf960b07@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA543A0.2060600@gmx.de>	
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>	
	<4BA60A40.9050600@gmx.de>
	<1ccfd1c11003210916rad35d31wb0501f9bf960b07@mail.gmail.com>
Message-ID: <4BA78011.2070504@gmx.de>

On 21.03.2010 17:16, Martin Buchholz wrote:
> On Sun, Mar 21, 2010 at 05:00, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>
>>> On Sat, Mar 20, 2010 at 14:52, Ulf Zibis<Ulf.Zibis at gmx.de>    wrote:
>>>
>>>
>>>> - A little "bug" in javadoc:
>>>>   @exception ArrayIndexOutOfBoundsException
>>>>   instead    IndexOutOfBoundsException
>>>>
>>>>
>>> Not a bug.
>>>
>>>
>> Yes, but it decreases the user's ability to catch exceptions more precisely
>> and flexibly.
>>
> There is a debate about whether to reuse existing exception classes
> or to throw class-specific subclasses.  IMO, IOOBE is a sufficiently expressive
> exception that I might have used just that, with expressive detail messages.
>

I'm with you. StringIndexOutOfBoundsException in particular appears as superfluous sugar to me. But we
have it in the docs, so there is no way to get rid of it.
What do you think about refactoring most IOOBEs in String-related classes to SIOOBEs? It would stay
compatible with old software, which still catches IOOBEs, but would look more straightforward, tidy and
clean, and would fix the bug mentioned below.

-Ulf


> But that's only a consideration when designing new API or a new platform.
> Old API must stay unchanged, for compatibility.
>
>
>> Imagine a method that throws an IndexOutOfBoundsException for some reason
>> and also calls Character.toChars(). The caller of such a method could
>> distinguish where the exception comes from, and have separate catch
>> blocks. But if not documented ... :-(
>>
>> In the extreme, the following would not be a bug in your sense either:
>>   @exception Exception
>>
>> I became sensitive to this, as I have seen real bugs in
>> AbstractStringBuilder the other way round, where methods actually throw
>> IndexOutOfBoundsExceptions, but their javadoc states
>> StringIndexOutOfBoundsException.
>>
> Now that's a real bug.
>
> Martin
>
>
>


From Ulf.Zibis at gmx.de  Mon Mar 22 14:45:32 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Mon, 22 Mar 2010 15:45:32 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003211038p5681441cqf7ba500b9f8079e5@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA546C1.5040201@gmx.de> <4BA55149.5070603@gmx.de>	
	<1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>	
	<4BA5F3F1.8080609@gmx.de>	
	<1ccfd1c11003210535l1505a0bfoe9f14c6ad4b07c42@mail.gmail.com>	
	<4BA61C40.8060608@gmx.de> <4BA620A5.5020608@gmx.de>
	<1ccfd1c11003211038p5681441cqf7ba500b9f8079e5@mail.gmail.com>
Message-ID: <4BA7828C.6060109@gmx.de>

On 21.03.2010 18:38, Martin Buchholz wrote:
> On Sun, Mar 21, 2010 at 06:35, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>>>        
>>>> On Sun, Mar 21, 2010 at 03:24, Ulf Zibis<Ulf.Zibis at gmx.de>    wrote:
>>>>          
>>>>> On 21.03.2010 09:05, Martin Buchholz wrote:
>>>>>            
>>>>>> On Sat, Mar 20, 2010 at 15:50, Ulf Zibis<Ulf.Zibis at gmx.de>      wrote:
>>>>>>              
>>>>> I think we should not define a distinct method for this once-used
>>>>> 3-liner:
>>>>>              for (; i < max - 1; i++)
>>>>>                  if (v[i] == high && v[i+1] == low)
>>>>>                          return i - offset;
>>>>>
>>>>> HotSpot's resources should not be over-stressed by inlining such things,
>>>>> leaving more reserves for more important things.
>>>>>            
>>>> On the contrary -
>>>> normally the above code snippet will rarely be executed,
>>>> and so will normally not be inlined into the caller,
>>>> which makes it easier for hotspot to inline
>>>> the caller into its caller.  Separate cold code into
>>>> separate methods.
>>>>          
>>> Thanks, I got the idea.
>>>
>>> But isn't the push-call-pop-return overhead comparable to those 3 lines
>>> here, not to forget caching the 3 values once more?
>>>        
> Even if I'm wrong, and this cold code is actually hot,
> I don't think there will be a big performance loss.
> The method call is outside the loop.
>    

What about at least reusing the cached values from the calling method via
indexOfSupplementary(ch, fromIndex, value, offset, max - 1)?

>    
>> And additionally the slow, rarely used branch would stay set in stone, even if
>> after some time the inline threshold is reached, as the JIT, AFAIK, can't
>> count the frequency of compiled code usage.
>>      
> It's certainly the intent that we will have multiple levels of
> compilation ("tiered compilation") and profiling would be
> enabled on at least some compiled code.
>    

That's an interesting option.

-Ulf


From Ulf.Zibis at gmx.de  Mon Mar 22 14:53:57 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Mon, 22 Mar 2010 15:53:57 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <4BA77FBA.9060308@univ-mlv.fr>
References: <4A95079A.8080803@gmx.de>		<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>		<4B9FE4DD.1090405@sun.com>		<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>		<4BA007A4.2030907@sun.com>	<4BA3F0B5.1070404@gmx.de>		<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>		<4BA564BB.9090901@gmx.de>		<1ccfd1c11003210020l1c200f89h95541d120fdc08cb@mail.gmail.com>		<4BA602F9.7000408@gmx.de>	<1ccfd1c11003210923q27e4d8bdj913350d8bca58195@mail.gmail.com>	<4BA7775A.2080506@gmx.de>
	<4BA77FBA.9060308@univ-mlv.fr>
Message-ID: <4BA78485.2020505@gmx.de>

On 22.03.2010 15:33, Rémi Forax wrote:
> On 22/03/2010 14:57, Ulf Zibis wrote:
>>
>> Oh, on parle français. Je l'aime beaucoup.
>
> Totally off topic but

You're free to add some opinion to the main topic. ;-)

> You mean: "j'aime beaucoup".
> je l'aime beaucoup means I love him/her a lot.

... ou "le Français" ?
Well, I'm afraid I can't compete with a vrai Français.

-Ulf


From Ulf.Zibis at gmx.de  Mon Mar 22 15:29:44 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Mon, 22 Mar 2010 16:29:44 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003211239h2105e5f1m903dd5d3fbf5387b@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA543A0.2060600@gmx.de>	
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
	<1ccfd1c11003211239h2105e5f1m903dd5d3fbf5387b@mail.gmail.com>
Message-ID: <4BA78CE8.9020107@gmx.de>

On 21.03.2010 20:39, Martin Buchholz wrote:
> On Sun, Mar 21, 2010 at 00:56, Martin Buchholz<martinrb at google.com>  wrote:
>
>    
>> Study also:
>> http://code.google.com/p/google-collections/source/browse/trunk/src/com/google/common/base/Preconditions.java
>>      
> Sorry, the best (most recent) version of Preconditions to study is here:
>
> http://code.google.com/p/guava-libraries/source/browse/trunk/src/com/google/common/base/Preconditions.java
>
> especially this comment:
>    

Thanks for the update. I'm not sure I understand the comment below
correctly. Does it mean that inlining the message from a constant is
slower than getting it from a call to badMsg()?

-Ulf

>    /*
>     * All recent hotspots (as of 2009) *really* like to have the natural code
>     *
>     * if (guardExpression) {
>     *    throw new BadException(messageExpression);
>     * }
>     *
>     * refactored so that messageExpression is moved to a separate
>     * String-returning method.
>     *
>     * if (guardExpression) {
>     *    throw new BadException(badMsg(...));
>     * }
>     *
>     * The alternative natural refactorings into void or Exception-returning
>     * methods are much slower.  This is a big deal - we're talking factors of
>     * 2-8 in microbenchmarks, not just 10-20%.  (This is a hotspot optimizer
>     * bug, which should be fixed, but that's a separate, big project).
>     *
>     * The coding pattern above is heavily used in java.util, e.g. in ArrayList.
>     * There is a RangeCheckMicroBenchmark in the JDK that was used to test this.
>     *
>     * But the methods in this class want to throw different exceptions,
>     * depending on the args, so it appears that this pattern is not directly
>     * applicable.  But we can use the ridiculous, devious trick of throwing an
>     * exception in the middle of the construction of another exception.
>     * Hotspot is fine with that.
>     */
>
>
> Martin
>
>
>    


From Xueming.Shen at Sun.COM  Tue Mar 23 20:32:42 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Tue, 23 Mar 2010 12:32:42 -0800
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
Message-ID: <4BA9256A.2020602@sun.com>

6937112: String.lastIndexOf confused by unpaired trailing surrogate

Kinda guess that it might bring us some performance benefit to separate
the supplementary handling code out into its own method (to help the
not-that-smart hotspot :-)?), but I doubt it is really something worth
doing. At least you don't have to have the redundant
value/offset = this.value/offset.

Seems like you started to attach the "final" keyword to all
"constants"... I guess it's a hint to help a smart VM with further
optimization. Is hotspot doing something special in a simple case like
the one below?

-Sherman

Martin Buchholz wrote:
> For a change, here's an actual plain old "incorrect result" bug fix
> for String.lastIndexOf
>
> Sherman, please file a bug and review.
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf/
>
> Also includes our usual performance-oriented fiddling.
>
> public class LastIndexOf {
>     public static void main(String[] args) {
>         int ch = 0x10042;
>         char[] bug = new char[3];
>         Character.toChars(ch, bug, 0);
>         bug[2] = bug[0];
>         System.out.println(new String(bug).lastIndexOf(ch));
>         bug[2] = '!';
>         System.out.println(new String(bug).lastIndexOf(ch));
>     }
> }
> ==> javac -source 1.6 -Xlint:all LastIndexOf.java
> ==> java -esa -ea LastIndexOf
> -1
> 0
>   


From Xueming.Shen at Sun.COM  Tue Mar 23 20:37:07 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Tue, 23 Mar 2010 12:37:07 -0800
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA9256A.2020602@sun.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com>
Message-ID: <4BA92673.3030200@sun.com>

CCed Masayoshi.

Masayoshi, Martin and Ulf are doing a "small" overhaul of those
supplementary methods; I guess you might be interested in reviewing
the change.

Martin, Ulf, please CC Masayoshi if you are touching the supplementary 
handling code.

-Sherman

Xueming Shen wrote:
> 6937112: String.lastIndexOf confused by unpaired trailing surrogate
>
> Kinda guess that it might bring us some performance benefit to separate
> the supplementary handling code out into its own method (to help the
> not-that-smart hotspot :-)?), but I doubt it is really something worth
> doing. At least you don't have to have the redundant
> value/offset = this.value/offset.
>
> Seems like you started to attach the "final" keyword to all
> "constants"... I guess it's a hint to help a smart VM with further
> optimization. Is hotspot doing something special in a simple case like
> the one below?
>
> -Sherman
>
> Martin Buchholz wrote:
>> For a change, here's an actual plain old "incorrect result" bug fix
>> for String.lastIndexOf
>>
>> Sherman, please file a bug and review.
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf/
>>
>> Also includes our usual performance-oriented fiddling.
>>
>> public class LastIndexOf {
>>     public static void main(String[] args) {
>>         int ch = 0x10042;
>>         char[] bug = new char[3];
>>         Character.toChars(ch, bug, 0);
>>         bug[2] = bug[0];
>>         System.out.println(new String(bug).lastIndexOf(ch));
>>         bug[2] = '!';
>>         System.out.println(new String(bug).lastIndexOf(ch));
>>     }
>> }
>> ==> javac -source 1.6 -Xlint:all LastIndexOf.java
>> ==> java -esa -ea LastIndexOf
>> -1
>> 0
>>   
>
>


From martinrb at google.com  Mon Mar 22 20:05:19 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 22 Mar 2010 13:05:19 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA9256A.2020602@sun.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com>
Message-ID: <1ccfd1c11003221305j3bc6a0eahb11ad3c5ce544732@mail.gmail.com>

On Tue, Mar 23, 2010 at 13:32, Xueming Shen <Xueming.Shen at sun.com> wrote:
> 6937112: String.lastIndexOf confused by unpaired trailing surrogate
>
> Kinda guess that it might bring us some performance benefit to separate the
> supplementary handling code out into its own method (to help the
> not-that-smart hotspot :-)?), but I doubt it is really something worth
> doing. At least you don't have to have the redundant
> value/offset = this.value/offset.

Yes, this is an "extreme" optimization, but one that is used
pervasively in java.util.concurrent (Doug Lea's influence) and
suitable for performance-critical methods.

Its only downside is the increase in size of source code.
(bytecode is actually smaller)

> Seems like you started to attach the "final" keyword to all
> "constants"... I guess it's a hint to help a smart VM with further
> optimization. Is hotspot doing something special in a simple case like
> the one below?

The "final" is there purely for software engineering reasons,
so that people don't make the common mistake of
modifying a field cached in a local, which would have no effect.
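
A minimal sketch of the idiom, assuming a made-up class CharBox with a char[] field (neither name comes from the webrev): the field is read once into a final local, and the compiler then rejects any accidental assignment to that local.

```java
// Hypothetical class, used only to show the "final local copy" idiom.
final class CharBox {
    private char[] value;

    CharBox(String s) {
        this.value = s.toCharArray();
    }

    int count(char c) {
        // Read the field once. Because the local is final, writing
        // "value = ..." below would be a compile-time error instead of
        // a silent update of the local that never reaches the field.
        final char[] value = this.value;
        int n = 0;
        for (int i = 0; i < value.length; i++) {
            if (value[i] == c) {
                n++;
            }
        }
        return n;
    }
}
```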

Martin

> -Sherman
>
> Martin Buchholz wrote:
>>
>> For a change, here's an actual plain old "incorrect result" bug fix
>> for String.lastIndexOf
>>
>> Sherman, please file a bug and review.
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf/
>>
>> Also includes our usual performance-oriented fiddling.
>>
>> public class LastIndexOf {
>>     public static void main(String[] args) {
>>         int ch = 0x10042;
>>         char[] bug = new char[3];
>>         Character.toChars(ch, bug, 0);
>>         bug[2] = bug[0];
>>         System.out.println(new String(bug).lastIndexOf(ch));
>>         bug[2] = '!';
>>         System.out.println(new String(bug).lastIndexOf(ch));
>>     }
>> }
>> ==> javac -source 1.6 -Xlint:all LastIndexOf.java
>> ==> java -esa -ea LastIndexOf
>> -1
>> 0
>>
>
>


From martinrb at google.com  Mon Mar 22 20:08:31 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 22 Mar 2010 13:08:31 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA7828C.6060109@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA546C1.5040201@gmx.de> <4BA55149.5070603@gmx.de>
	<1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>
	<4BA5F3F1.8080609@gmx.de>
	<1ccfd1c11003210535l1505a0bfoe9f14c6ad4b07c42@mail.gmail.com>
	<4BA61C40.8060608@gmx.de> <4BA620A5.5020608@gmx.de>
	<1ccfd1c11003211038p5681441cqf7ba500b9f8079e5@mail.gmail.com>
	<4BA7828C.6060109@gmx.de>
Message-ID: <1ccfd1c11003221308q1644bbb5w38141fe55a66ba8e@mail.gmail.com>

On Mon, Mar 22, 2010 at 07:45, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 21.03.2010 18:38, Martin Buchholz wrote:

>> Even if I'm wrong, and this cold code is actually hot,
>> I don't think there will be a big performance loss.
>> The method call is outside the loop.
>>
>
> What about at least reusing the cached values from calling method via
> indexOfSupplementary(ch, fromIndex, value, offset, max - 1) ?

No.  We're optimizing for BMP.
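
The shape being argued for can be sketched as below. This is an illustration under assumptions, not the code in the webrev: the class IndexOfSketch is hypothetical, and it searches a plain char[] without the offset/count fields that String had at the time.

```java
// Hypothetical class illustrating "BMP fast path, cold supplementary method".
final class IndexOfSketch {
    private final char[] value;

    IndexOfSketch(String s) {
        this.value = s.toCharArray();
    }

    int indexOf(int ch, int fromIndex) {
        final char[] value = this.value;
        final int max = value.length;
        if (fromIndex < 0) {
            fromIndex = 0;
        } else if (fromIndex >= max) {
            return -1;
        }
        if (ch >>> 16 == 0) {
            // Hot path: BMP code point (or lone surrogate), one char compare.
            for (int i = fromIndex; i < max; i++) {
                if (value[i] == ch) {
                    return i;
                }
            }
            return -1;
        }
        // Cold path kept out of line; the call happens once, outside the loop.
        return indexOfSupplementary(ch, fromIndex);
    }

    // Re-reads the field instead of taking the caller's cached locals,
    // so the hot method stays as small as possible.
    private int indexOfSupplementary(int ch, int fromIndex) {
        if (Character.isValidCodePoint(ch)) {
            final char[] value = this.value;
            final char[] pair = Character.toChars(ch);
            final char high = pair[0];
            final char low = pair[1];
            final int max = value.length - 1;
            for (int i = fromIndex; i < max; i++) {
                if (value[i] == high && value[i + 1] == low) {
                    return i;
                }
            }
        }
        return -1;
    }
}
```

Passing the cached values as extra arguments would save a few field reads on the supplementary path, at the cost of a longer call sequence in the BMP-oriented caller.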

Martin


From martinrb at google.com  Mon Mar 22 22:03:03 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 22 Mar 2010 15:03:03 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA78011.2070504@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
	<4BA60A40.9050600@gmx.de>
	<1ccfd1c11003210916rad35d31wb0501f9bf960b07@mail.gmail.com>
	<4BA78011.2070504@gmx.de>
Message-ID: <1ccfd1c11003221503r46e6bb78g241e2b07ff7f1b3c@mail.gmail.com>

On Mon, Mar 22, 2010 at 07:34, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 21.03.2010 17:16, Martin Buchholz wrote:

>> There is a debate about whether to reuse existing exception classes
>> or to throw class-specific subclasses. IMO, IOOBE is a sufficiently
>> expressive exception that I might have used just that, with expressive
>> detail messages.
>>
>
> I'm with you. StringIndexOutOfBoundsException in particular appears as
> superfluous sugar to me. But we have it in the docs, so there is no way to
> get rid of it.
> What do you think about refactoring most IOOBEs in String-related classes to
> SIOOBEs? It would stay compatible with old software, which still catches
> IOOBEs, but would look more straightforward, tidy and clean, and would fix the
> bug mentioned below.

Every change is an incompatible change, with a risk/benefit tradeoff.

IMO there is no change to the exceptions thrown, or declared to be thrown,
or to their detail messages, in the string classes that is worth the risk
of incompatible change.
(with the exception of when the implementation contradicts the spec,
which is worth fixing)
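
A small sketch of why even a "compatible" narrowing is observable. The class ExceptionCompatSketch and its methods are hypothetical; the only facts relied on are that StringIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException and has an int constructor.

```java
// Hypothetical example: narrowing a thrown exception to a subclass.
final class ExceptionCompatSketch {
    // Throws the more specific subclass on a bad index.
    static char charAt(char[] chars, int index) {
        if (index < 0 || index >= chars.length) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return chars[index];
    }

    // An old-style caller that catches the superclass still works,
    // but the exact class it observes has changed.
    static String classify(char[] chars, int index) {
        try {
            charAt(chars, index);
            return "ok";
        } catch (IndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

Code that catches IndexOutOfBoundsException keeps working, but anything that inspects the exact class or the detail message (tests, logs, error mapping) would see a difference.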

Martin


From martinrb at google.com  Mon Mar 22 22:08:08 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 22 Mar 2010 15:08:08 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA78CE8.9020107@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
	<1ccfd1c11003211239h2105e5f1m903dd5d3fbf5387b@mail.gmail.com>
	<4BA78CE8.9020107@gmx.de>
Message-ID: <1ccfd1c11003221508n6180fd1dk862d47a2f27f42e2@mail.gmail.com>

On Mon, Mar 22, 2010 at 08:29, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 21.03.2010 20:39, Martin Buchholz wrote:
>>
>> On Sun, Mar 21, 2010 at 00:56, Martin Buchholz <martinrb at google.com> wrote:
>>
>>
>>>
>>> Study also:
>> http://code.google.com/p/guava-libraries/source/browse/trunk/src/com/google/common/base/Preconditions.java
>>
>> especially this comment:
>>
>
> Thanks for the update. I'm not sure I understand the comment below correctly.
> Does it mean that inlining the message from a constant is slower than
> getting it from a call to badMsg()?

I'm not sure I understand exactly, but as the comment says,
always make your error-checking code look like this:

   *
   * if (guardExpression) {
   *    throw new BadException(badMsg(...));
   * }
   *

although in String it's not so important because
there's no String concatenation, which is a notable cause
of cold bytecode bloat.

Martin


From martinrb at google.com  Mon Mar 22 22:27:37 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 22 Mar 2010 15:27:37 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BA78CE8.9020107@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B9FE4DD.1090405@sun.com>
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
	<1ccfd1c11003211239h2105e5f1m903dd5d3fbf5387b@mail.gmail.com>
	<4BA78CE8.9020107@gmx.de>
Message-ID: <1ccfd1c11003221527q29f61f7u700344a99d293ceb@mail.gmail.com>

Ulf,

I'd like to start a mq patch containing changes to
the String exception handling in the string classes.
Please provide me with a patch that uses the
blessed conventional names from Preconditions.java.

For the version that checks an offset and length for
containment within a larger sequence, I would prefer
the name "checkSubsequence", for example

private static void checkSubsequence(int start, int len, int size)
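
One possible body for such a helper, as a sketch only: the enclosing class SubsequenceChecks, the exception type, the message format and the helper badSubsequenceMsg are assumptions, and size is assumed to be non-negative.

```java
// Hypothetical holder class; in the proposal the method would be private.
final class SubsequenceChecks {
    // Verifies that [start, start + len) lies within [0, size).
    static void checkSubsequence(int start, int len, int size) {
        // "start > size - len" avoids the int overflow that
        // "start + len > size" could hit for very large arguments.
        if (start < 0 || len < 0 || start > size - len) {
            throw new IndexOutOfBoundsException(badSubsequenceMsg(start, len, size));
        }
    }

    // Message construction kept in a separate String-returning method,
    // following the Preconditions advice quoted earlier in the thread.
    private static String badSubsequenceMsg(int start, int len, int size) {
        return "start: " + start + ", len: " + len + ", size: " + size;
    }
}
```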

Please make sure that there are sufficient tests in
test/java/lang/String to ensure that you are not
inadvertently making changes to the exceptions thrown.

I note that test/java/lang/String/{Exceptions,Supplementary}
do try to test exception handling, but do not appear to
test for the *exact* class of the exception thrown,
nor the detail message of the exception.
When those tests were written, compatibility was less important.

Please adapt my
test/java/util/ArrayList/RangeCheckMicroBenchmark.java
to test string classes instead.
There is a good chance that you can demonstrate
a performance improvement on ordinary String operations!

Thanks,

Martin


From martinrb at google.com  Mon Mar 22 22:36:20 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 22 Mar 2010 15:36:20 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA92673.3030200@sun.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
Message-ID: <1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>

Masayoshi,

Ulf and I are working on a few changes to supplementary character handling.
Character.isSurrogate has already gone in.

The following are in the pipeline:

6934268: Better implementation of Character.isValidCodePoint
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isValidCodePoint
6934265: Add public method Character.isBMPCodePoint
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
[mq]: isBMPCodePoint2
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2
6937112: String.lastIndexOf confused by unpaired trailing surrogate
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf

In addition, Ulf and I would like to add
char Character.highSurrogate(int codePoint)
char Character.lowSurrogate(int codePoint)
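
For reference, the two conversions can be sketched as below. This is not Ulf's patch: the class SurrogateSketch is hypothetical, and the methods assume the argument is a valid supplementary code point (no validation is done).

```java
// Hypothetical class; assumes a valid supplementary code point as input.
final class SurrogateSketch {
    // Leading (high) surrogate: upper bits of the code point, rebased so
    // that U+10000 maps to U+D800.
    static char highSurrogate(int codePoint) {
        return (char) ((codePoint >>> 10)
                + (Character.MIN_HIGH_SURROGATE
                   - (Character.MIN_SUPPLEMENTARY_CODE_POINT >>> 10)));
    }

    // Trailing (low) surrogate: lowest 10 bits, offset into U+DC00..U+DFFF.
    static char lowSurrogate(int codePoint) {
        return (char) ((codePoint & 0x3ff) + Character.MIN_LOW_SURROGATE);
    }
}
```

For the code point 0x10042 used in the lastIndexOf test case, this yields the pair 0xD800, 0xDC42, the same as Character.toChars.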

Ulf,
please provide me with your latest patch for Character.highSurrogate
and I will add it to the pipeline.

Martin

On Tue, Mar 23, 2010 at 13:37, Xueming Shen <Xueming.Shen at sun.com> wrote:
> CCed Masayoshi.
>
> Masayoshi, Martin and Ulf are doing a "small" overhaul of those
> supplementary methods; I guess you might be interested in reviewing
> the change.
>
> Martin, Ulf, please CC Masayoshi if you are touching the supplementary
> handling code.
>


From Ulf.Zibis at gmx.de  Mon Mar 22 23:02:03 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 00:02:03 +0100
Subject: Kinda ?
In-Reply-To: <4BA9256A.2020602@sun.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com>
Message-ID: <4BA7F6EB.6040804@gmx.de>

Can somebody betray the sense of "Kinda" to me?

-Ulf


From Paul.Hohensee at Sun.COM  Mon Mar 22 23:13:00 2010
From: Paul.Hohensee at Sun.COM (Paul Hohensee)
Date: Mon, 22 Mar 2010 19:13:00 -0400
Subject: Kinda ?
In-Reply-To: <4BA7F6EB.6040804@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA7F6EB.6040804@gmx.de>
Message-ID: <4BA7F97C.5070202@sun.com>

"in a way" plus "somewhat", as in "it's kinda bad" == "in a way, it's 
somewhat bad".

On 3/22/10 7:02 PM, Ulf Zibis wrote:
> Can somebody betray the sense of "Kinda" to me?
>
> -Ulf
>
>


From Ulf.Zibis at gmx.de  Mon Mar 22 23:14:21 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 00:14:21 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003221308q1644bbb5w38141fe55a66ba8e@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA546C1.5040201@gmx.de> <4BA55149.5070603@gmx.de>	
	<1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>	
	<4BA5F3F1.8080609@gmx.de>	
	<1ccfd1c11003210535l1505a0bfoe9f14c6ad4b07c42@mail.gmail.com>	
	<4BA61C40.8060608@gmx.de> <4BA620A5.5020608@gmx.de>	
	<1ccfd1c11003211038p5681441cqf7ba500b9f8079e5@mail.gmail.com>	
	<4BA7828C.6060109@gmx.de>
	<1ccfd1c11003221308q1644bbb5w38141fe55a66ba8e@mail.gmail.com>
Message-ID: <4BA7F9CD.7090601@gmx.de>

On 22.03.2010 21:08, Martin Buchholz wrote:
> On Mon, Mar 22, 2010 at 07:45, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> On 21.03.2010 18:38, Martin Buchholz wrote:
>>      
>    
>>> Even if I'm wrong, and this cold code is actually hot,
>>> I don't think there will be a big performance loss.
>>> The method call is outside the loop.
>>>
>>>        
>> What about at least reusing the cached values from calling method via
>> indexOfSupplementary(ch, fromIndex, value, offset, max - 1) ?
>>      
> No.  We're optimizing for BMP.
>    

There would be no harm to the speed of the BMP case. HotSpot wouldn't copy
the values to the stack if (1) ch is a BMP character and (2)
indexOfSupplementary() becomes inlined.
I think Sherman is right that we "dont have to have the redundant
value/offset=this.value/offset".

-Ulf


From Ulf.Zibis at gmx.de  Mon Mar 22 23:17:42 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 00:17:42 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA9256A.2020602@sun.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com>
Message-ID: <4BA7FA96.7070102@gmx.de>

Sherman, can you have a look at your PC clock? I guess it's misadjusted.

-Ulf


From Xueming.Shen at Sun.COM  Mon Mar 22 23:39:20 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Mon, 22 Mar 2010 16:39:20 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA7FA96.7070102@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA7FA96.7070102@gmx.de>
Message-ID: <4BA7FFA8.7000302@sun.com>

Thanks Ulf! Fixed:-)

Ulf Zibis wrote:
> Sherman, can you have a look at your PC clock? I guess it's misadjusted.
>
> -Ulf
>
>


From weijun.wang at sun.com  Tue Mar 23 02:42:26 2010
From: weijun.wang at sun.com (weijun.wang at sun.com)
Date: Tue, 23 Mar 2010 02:42:26 +0000
Subject: hg: jdk7/tl/jdk: 6586707: NTLM authentication with proxy fails
Message-ID: <20100323024302.3CD6A4472E@hg.openjdk.java.net>

Changeset: 31dcf23042f9
Author:    weijun
Date:      2010-03-23 10:41 +0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/31dcf23042f9

6586707: NTLM authentication with proxy fails
Reviewed-by: chegar

! src/share/classes/sun/net/www/protocol/http/HttpURLConnection.java


From Ulf.Zibis at gmx.de  Tue Mar 23 11:34:51 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 12:34:51 +0100
Subject: Kinda ?
In-Reply-To: <4BA85AFA.70005@paradise.net.nz>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA7F6EB.6040804@gmx.de>
	<4BA7F97C.5070202@sun.com> <4BA85AFA.70005@paradise.net.nz>
Message-ID: <4BA8A75B.80904@gmx.de>

Many thanks for your kind answers.

I could not find it on my beloved LEO 
<http://dict.leo.org/ende?lang=de&lp=ende&search=kinda>.

-Ulf


On 23.03.2010 07:08, Bruce Chapman & Barbara Carey wrote:
> Paul Hohensee wrote:
>> "in a way" plus "somewhat", as in "it's kinda bad" == "in a way, it's 
>> somewhat bad".
>>
>> On 3/22/10 7:02 PM, Ulf Zibis wrote:
>>> Can somebody betray the sense of "Kinda" to me?
>>>
>>> -Ulf
>>>
>>>
>>
> a spoken contraction of "kind of" (similar meaning to sorta a 
> contraction of sort-of)
>
> nothing to do with children (kinder) although you might sometimes see 
> it spelt that way too.
>
> Bruce
>
>

From Ulf.Zibis at gmx.de  Tue Mar 23 12:22:29 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 13:22:29 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8E3DA3.7090902@gmx.de>	
	<1ccfd1c11003030806h45c16691p97961cb1003eba55@mail.gmail.com>	
	<4B8EB46C.1010208@sun.com> <4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995D22.2020507@gmx.de>
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
Message-ID: <4BA8B285.1040403@gmx.de>

On 13.03.2010 00:04, Martin Buchholz wrote:
>
>> That reminds me that some months ago I prepared a beautified version of
>> Character's source (things like the above, replacing <code> with {@code},
>> fixing indentation inconsistencies, etc.). Would there be interest in such a
>> patch?
>>      
> Please provide URL of patch.
>
>    

I did all of this work here:
https://bugs.openjdk.java.net/show_bug.cgi?id=100104

I suggest starting with the patch "Cosmetics 1" and then going further.

Unfortunately the patches don't contain our latest bit twiddling, but I
think "Cosmetics 1" / "2" could be done first, and afterwards we could
include the bit twiddling.

Ulf


From Ulf.Zibis at gmx.de  Tue Mar 23 12:58:01 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 13:58:01 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA7F9CD.7090601@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>		<4BA546C1.5040201@gmx.de>
	<4BA55149.5070603@gmx.de>		<1ccfd1c11003210105s12220ffcrf139c2c59d33f84d@mail.gmail.com>		<4BA5F3F1.8080609@gmx.de>		<1ccfd1c11003210535l1505a0bfoe9f14c6ad4b07c42@mail.gmail.com>		<4BA61C40.8060608@gmx.de>
	<4BA620A5.5020608@gmx.de>		<1ccfd1c11003211038p5681441cqf7ba500b9f8079e5@mail.gmail.com>		<4BA7828C.6060109@gmx.de>	<1ccfd1c11003221308q1644bbb5w38141fe55a66ba8e@mail.gmail.com>
	<4BA7F9CD.7090601@gmx.de>
Message-ID: <4BA8BAD9.1000809@gmx.de>

On 23.03.2010 00:14, Ulf Zibis wrote:
> On 22.03.2010 21:08, Martin Buchholz wrote:
>> On Mon, Mar 22, 2010 at 07:45, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>> On 21.03.2010 18:38, Martin Buchholz wrote:
>>>> Even if I'm wrong, and this cold code is actually hot,
>>>> I don't think there will be a big performance loss.
>>>> The method call is outside the loop.
>>>>
>>> What about at least reusing the cached values from calling method via
>>> indexOfSupplementary(ch, fromIndex, value, offset, max - 1) ?
>> No.  We're optimizing for BMP.
>
> There would be no harm to the speed of the BMP case. HotSpot wouldn't copy
> the values to the stack if (1) ch is a BMP character and (2)
> indexOfSupplementary() becomes inlined.
> I think Sherman is right that we "dont have to have the redundant
> value/offset=this.value/offset".

Additionally, if indexOfSupplementary() were static, passing the this
pointer would be avoided.
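
To make the shape of the proposal concrete, here is a simplified sketch, not the actual String source. It assumes the value/offset/count layout discussed in this thread, works on a plain char[] so that it is self-contained, and uses the hypothetical class name IndexOfSketch. The BMP loop stays in the caller, and the rarely used supplementary path is a static helper that receives the already-loaded values as parameters:

```java
final class IndexOfSketch {
    /** Index of code point ch within value[offset .. offset+count), or -1. */
    static int indexOf(char[] value, int offset, int count, int ch, int fromIndex) {
        if (fromIndex < 0) {
            fromIndex = 0;
        } else if (fromIndex >= count) {
            return -1;
        }
        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // Hot path: BMP character (or negative, which never matches).
            final int max = offset + count;
            for (int i = offset + fromIndex; i < max; i++) {
                if (value[i] == ch) {
                    return i - offset;
                }
            }
            return -1;
        }
        // Cold path: static helper, so no this pointer is passed and the
        // caller's value/offset/count are reused instead of being reloaded.
        return indexOfSupplementary(value, offset, count, ch, fromIndex);
    }

    private static int indexOfSupplementary(char[] value, int offset, int count,
                                            int ch, int fromIndex) {
        if (!Character.isValidCodePoint(ch)) {
            return -1;
        }
        final char[] pair = Character.toChars(ch);
        final int max = offset + count - 1;   // a pair needs two chars
        for (int i = offset + fromIndex; i < max; i++) {
            if (value[i] == pair[0] && value[i + 1] == pair[1]) {
                return i - offset;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] backing = "xxa\uD83D\uDE00b".toCharArray();
        // View "a", U+1F600, "b" through offset 2, count 4.
        System.out.println(indexOf(backing, 2, 4, 'b', 0));
        System.out.println(indexOf(backing, 2, 4, 0x1F600, 0));
    }
}
```

Whether HotSpot actually inlines the helper, and whether the extra parameters cost anything measurable, would have to be confirmed with a benchmark.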

-Ulf


From christopher.hegarty at sun.com  Tue Mar 23 13:57:59 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Tue, 23 Mar 2010 13:57:59 +0000
Subject: hg: jdk7/tl/jdk: 6614957: HttpsURLConnection not using the set
	SSLSocketFactory for creating all its Sockets; ...
Message-ID: <20100323135911.52071447CD@hg.openjdk.java.net>

Changeset: 8a9ebdc27045
Author:    chegar
Date:      2010-03-23 13:54 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/8a9ebdc27045

6614957: HttpsURLConnection not using the set SSLSocketFactory for creating all its Sockets
6771432: createSocket() - smpatch fails using 1.6.0_10 because of "Unconnected sockets not implemented"
6766775: X509 certificate hostname checking is broken in JDK1.6.0_10
Summary: All three bugs are interdependent
Reviewed-by: xuelei

! src/share/classes/javax/net/SocketFactory.java
! src/share/classes/sun/net/NetworkClient.java
! src/share/classes/sun/net/www/protocol/https/HttpsClient.java
! src/share/classes/sun/security/ssl/SSLSocketImpl.java
+ test/sun/security/ssl/sun/net/www/protocol/https/HttpsURLConnection/DNSIdentities.java
+ test/sun/security/ssl/sun/net/www/protocol/https/HttpsURLConnection/HttpsCreateSockTest.java
+ test/sun/security/ssl/sun/net/www/protocol/https/HttpsURLConnection/HttpsSocketFacTest.java
+ test/sun/security/ssl/sun/net/www/protocol/https/HttpsURLConnection/IPAddressDNSIdentities.java
+ test/sun/security/ssl/sun/net/www/protocol/https/HttpsURLConnection/IPAddressIPIdentities.java
+ test/sun/security/ssl/sun/net/www/protocol/https/HttpsURLConnection/IPIdentities.java
+ test/sun/security/ssl/sun/net/www/protocol/https/HttpsURLConnection/Identities.java


From Ulf.Zibis at gmx.de  Tue Mar 23 16:11:48 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 17:11:48 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
Message-ID: <4BA8E844.7080901@gmx.de>

On 22.03.2010 23:36, Martin Buchholz wrote:
> Masayoshi,
>
> Ulf and I are working on a few changes to supplementary character handling.
> Character.isSurrogate has already gone in.
>
> The following are in the pipeline:
>
> 6934268: Better implementation of Character.isValidCodePoint
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isValidCodePoint
> 6934265: Add public method Character.isBMPCodePoint
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
> [mq]: isBMPCodePoint2
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2
> 6937112: String.lastIndexOf confused by unpaired trailing surrogate
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf
>
> In addition, Ulf and I would like to add
> char Character.highSurrogate(int codePoint)
> char Character.lowSurrogate(int codePoint)
>
> Ulf,
> please provide me with your latest patch for Character.highSurrogate
> and I will add it to the pipeline.

Here it is.

I couldn't resist doing some beautifying and purging of
sun.nio.cs.Surrogate.
Feel free to ignore it.

-Ulf

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Character_highLowSurrogate
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100323/a59d9704/Character_highLowSurrogate.ksh>

From Ulf.Zibis at gmx.de  Tue Mar 23 17:59:36 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 18:59:36 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA8E844.7080901@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>		<4BA9256A.2020602@sun.com>
	<4BA92673.3030200@sun.com>	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de>
Message-ID: <4BA90188.3090902@gmx.de>

On 23.03.2010 17:11, Ulf Zibis wrote:
> On 22.03.2010 23:36, Martin Buchholz wrote:
>> Masayoshi,
>>
>> Ulf and I are working on a few changes to supplementary character 
>> handling.
>> Character.isSurrogate has already gone in.
>>
>> The following are in the pipeline:
>>
>> 6934268: Better implementation of Character.isValidCodePoint
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isValidCodePoint
>> 6934265: Add public method Character.isBMPCodePoint
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint 
>>
>> [mq]: isBMPCodePoint2
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2
>> 6937112: String.lastIndexOf confused by unpaired trailing surrogate
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf
>>
>> In addition, Ulf and I would like to add
>> char Character.highSurrogate(int codePoint)
>> char Character.lowSurrogate(int codePoint)
>>
>> Ulf,
>> please provide me with your latest patch for Character.highSurrogate
>> and I will add it to the pipeline.
>
> Here it is.
>
> I couldn't resist doing some beautifying and purging of
> sun.nio.cs.Surrogate.
> Feel free to ignore it.
>
> -Ulf

A small correction.

-Ulf

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Character_highLowSurrogate
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100323/169703c4/Character_highLowSurrogate.ksh>

From Ulf.Zibis at gmx.de  Tue Mar 23 18:17:39 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 19:17:39 +0100
Subject: Kinda ?
In-Reply-To: <4BA80022.3010907@oracle.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	<4BA9256A.2020602@sun.com>
	<4BA7F6EB.6040804@gmx.de> <4BA80022.3010907@oracle.com>
Message-ID: <4BA905C3.8080902@gmx.de>

On 23.03.2010 00:41, David Holmes wrote:
> Ulf Zibis said the following on 03/23/10 09:02:
>> Can somebody betray the sense of "Kinda" to me?
>
> PS. You really meant "convey the sense of" not "betray". :)

The typical trap of using dictionaries. Thanks.
I meant it in a kinda ironic sense, as in giving away a secret.
"verraten" in German has both meanings.

-Ulf


From martinrb at google.com  Tue Mar 23 18:19:11 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 23 Mar 2010 11:19:11 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA90188.3090902@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
Message-ID: <1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>

Ulf,

Please do not delete methods in Surrogate.java
(because we take compatibility seriously)
but instead gently denigrate them,
as I do below (added to my patch isBMPCodePoint2)

diff --git a/src/share/classes/sun/nio/cs/Surrogate.java b/src/share/classes/sun/nio/cs/Surrogate.java
--- a/src/share/classes/sun/nio/cs/Surrogate.java
+++ b/src/share/classes/sun/nio/cs/Surrogate.java
@@ -77,6 +77,7 @@
     /**
      * Tells whether or not the given UCS-4 character must be represented as a
      * surrogate pair in UTF-16.
+     * Use of {@link Character#isSupplementaryCodePoint} is generally preferred.
      */
     public static boolean neededFor(int uc) {
         return Character.isSupplementaryCodePoint(uc);
@@ -102,6 +103,7 @@

     /**
      * Converts the given surrogate pair into a 32-bit UCS-4 character.
+     * Use of {@link Character#toCodePoint} is generally preferred.
      */
     public static int toUCS4(char c, char d) {
         assert Character.isHighSurrogate(c) && Character.isLowSurrogate(d);


Martin


From Ulf.Zibis at gmx.de  Tue Mar 23 18:31:22 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 23 Mar 2010 19:31:22 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>	
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>	
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>
Message-ID: <4BA908FA.9040107@gmx.de>

Ok, sorry and thanks.

Wouldn't "deprecated" be more noticeable?

What about using this message from the compiler?
warning: Surrogate is Sun proprietary API and may be removed in a future 
release.
@deprecated Public replacement is {@link Character#isSupplementaryCodePoint}

-Ulf


On 23.03.2010 19:19, Martin Buchholz wrote:
> Ulf,
>
> Please do not delete methods in Surrogate.java
> (because we take compatibility seriously)
> but instead gently denigrate them,
> as I do below (added to my patch isBMPCodePoint2)
>
> diff --git a/src/share/classes/sun/nio/cs/Surrogate.java b/src/share/classes/sun/nio/cs/Surrogate.java
> --- a/src/share/classes/sun/nio/cs/Surrogate.java
> +++ b/src/share/classes/sun/nio/cs/Surrogate.java
> @@ -77,6 +77,7 @@
>       /**
>        * Tells whether or not the given UCS-4 character must be represented as a
>        * surrogate pair in UTF-16.
> +     * Use of {@link Character#isSupplementaryCodePoint} is generally preferred.
>        */
>       public static boolean neededFor(int uc) {
>           return Character.isSupplementaryCodePoint(uc);
> @@ -102,6 +103,7 @@
>
>       /**
>        * Converts the given surrogate pair into a 32-bit UCS-4 character.
> +     * Use of {@link Character#toCodePoint} is generally preferred.
>        */
>       public static int toUCS4(char c, char d) {
>           assert Character.isHighSurrogate(c) && Character.isLowSurrogate(d);
>
>
> Martin
>
>
>    


From martinrb at google.com  Tue Mar 23 19:17:16 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 23 Mar 2010 12:17:16 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA908FA.9040107@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>
	<4BA908FA.9040107@gmx.de>
Message-ID: <1ccfd1c11003231217p85ce189j63831ee31604f7cc@mail.gmail.com>

Deprecation of Surrogate methods is a reasonable choice.
Of course, users are not supposed to use Surrogate,
and they are regularly pestered by javac not to.

I rejected deprecation in favor of "denigration" because there is nothing
actually wrong with the existing methods except that they
are not standardized.  In particular, we would never dream
of removing them.  Surrogate has the very big advantage over the
new methods in Character of being compatible with prior JDK releases.

Deprecation is generally reserved for APIs that are actively harmful.
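
To illustrate the difference between the two options, here is a small sketch. The class LegacySurrogate is invented for this example and merely mimics the shape of the Surrogate methods in the diff above. The first method is "denigrated" in its javadoc only, so callers compile without warnings; the second carries @Deprecated, so javac warns at call sites in other classes:

```java
final class LegacySurrogate {
    /**
     * Tells whether or not the given code point must be represented as a
     * surrogate pair in UTF-16.
     * Use of {@link Character#isSupplementaryCodePoint} is generally preferred.
     */
    static boolean neededFor(int uc) {
        // Denigrated: documentation points elsewhere, no compiler warning.
        return Character.isSupplementaryCodePoint(uc);
    }

    /**
     * Converts the given surrogate pair into a code point.
     *
     * @deprecated Public replacement is {@link Character#toCodePoint}.
     */
    @Deprecated
    static int toUCS4(char c, char d) {
        // Deprecated: callers in other classes get a javac deprecation warning.
        return Character.toCodePoint(c, d);
    }

    public static void main(String[] args) {
        System.out.println(neededFor(0x1F600));
        System.out.printf("U+%X%n", toUCS4((char) 0xD83D, (char) 0xDE00));
    }
}
```

Both variants keep the methods callable, which is the compatibility point being made here; they differ only in how loudly callers are steered away.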

Martin

On Tue, Mar 23, 2010 at 11:31, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Ok, sorry and thanks.
>
> Wouldn't "deprecated" be more noticeable?
>
> What about using this message from compiler? :
> warning: Surrogate is Sun proprietary API and may be removed in a future
> release.
> @deprecated Public replacement is {@link Character#isSupplementaryCodePoint}


From martinrb at google.com  Tue Mar 23 23:50:20 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 23 Mar 2010 16:50:20 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA90188.3090902@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
Message-ID: <1ccfd1c11003231650g48dc0fb8gd445c46699433377@mail.gmail.com>

I've added another mini-patch to my patch set.

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint3

This deletes Surrogate.java, as Ulf wants,
except that ... it's another variant of Surrogate.java!
(which I didn't know existed)

Uses of Surrogate.neededFor are all now changed to
Character.isSupplementaryCodePoint, as suggested by Ulf.

I intend to fold all of the isBMPCodePoint patches together into one
before I commit them.

Ulf, please review.

Martin


From jonathan.gibbons at sun.com  Wed Mar 24 01:07:03 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Wed, 24 Mar 2010 01:07:03 +0000
Subject: hg: jdk7/tl/langtools: 6937244: sqe ws7 tools javap/javap_t10a fail
	jdk7 b80 used output of javap is changed
Message-ID: <20100324010711.722334408A@hg.openjdk.java.net>

Changeset: dd30de080cb9
Author:    jjg
Date:      2010-03-23 18:05 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/dd30de080cb9

6937244: sqe ws7 tools javap/javap_t10a fail jdk7 b80 used  output of javap is changed
Reviewed-by: darcy

! src/share/classes/com/sun/tools/javap/ClassWriter.java
+ test/tools/javap/6937244/T6937244.java
+ test/tools/javap/6937244/T6937244A.java


From martinrb at google.com  Tue Mar 23 22:50:09 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 23 Mar 2010 15:50:09 -0700
Subject: Concurrent calls to new Random() not random enough
Message-ID: <1ccfd1c11003231550x5bc5509dy3061513adafd0e40@mail.gmail.com>

Hi Sherman,

This is a bug report (sorry, no fix this time)

Synopsis: Concurrent calls to new Random() not random enough
Description:
new Random() promises this:
    /**
     * Creates a new random number generator. This constructor sets
     * the seed of the random number generator to a value very likely
     * to be distinct from any other invocation of this constructor.
     */

but if there are concurrent calls to new Random(), it does not
do very well at fulfilling its contract.

The following program should print out a number much closer to 0.

import java.util.*;

public class RandomSeed {
    public static void main(String[] args) throws Throwable {
        class RandomCollector implements Runnable {
            int[] randoms = new int[1<<21];
            public void run() {
                for (int i = 0; i < randoms.length; i++)
                    randoms[i] = new Random().nextInt();
            }};
        final int threadCount = 2;
        List<RandomCollector> collectors = new ArrayList<RandomCollector>();
        List<Thread> threads = new ArrayList<Thread>();
        for (int i = 0; i < threadCount; i++) {
            RandomCollector r = new RandomCollector();
            collectors.add(r);
            threads.add(new Thread(r));
        }
        for (Thread thread : threads)
            thread.start();
        for (Thread thread : threads)
            thread.join();
        int collisions = 0;
        HashSet<Integer> s = new HashSet<Integer>();
        for (RandomCollector r : collectors) {
            for (int x : r.randoms) {
                if (s.contains(x))
                    collisions++;
                s.add(x);
            }
        }
        System.out.println(collisions);
    }
}
---
==> javac -source 1.6 -Xlint:all RandomSeed.java
==> java -esa -ea RandomSeed
876


From martinrb at google.com  Tue Mar 23 22:59:29 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 23 Mar 2010 15:59:29 -0700
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4BA8B285.1040403@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B8EB46C.1010208@sun.com>
	<4B92C263.9020404@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995D22.2020507@gmx.de>
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
	<4BA8B285.1040403@gmx.de>
Message-ID: <1ccfd1c11003231559x25aef975hca5b81e9dfe9b09c@mail.gmail.com>

On Tue, Mar 23, 2010 at 05:22, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 13.03.2010 00:04, Martin Buchholz wrote:
>>
>>> That reminds me that some months ago I prepared a beautified version of
>>> Character's source (things like the above, replacing <code> with {@code},
>>> fixing indentation inconsistencies, etc.). Would there be interest in such
>>> a patch?

I support the plan of fixing coding style in core libraries when there is
consensus amongst developers, as there is with
<code> => @code
and
@exception => @throws

I think the right way to do this is to modify large portions of the
java libraries using a script.  The script should be checked into
the jdk repo as part of the fix.  There should be automated verification
that the generated javadoc is left unchanged.

There is precedent, for example the recent whitespace changes by Kelly,
and my own fixes to @since in jdk6.

To get you started, here is some elisp code that I have used when
making such changes on a file-level:

(defun tt-code ()
  (interactive)
  (query-replace-regexp "<\\(tt\\|code\\)>\\([^&<>\\\\]+\\)</\\1>"
"{@code \\2}"))
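
For a scripted, non-interactive run, the same pattern carries over to java.util.regex almost unchanged. The following is only a sketch under that assumption; the class name CodeTagSketch is hypothetical and it is not the script proposed above. As in the elisp version, tag bodies containing &, <, > or a backslash are deliberately left alone, since they cannot be converted mechanically:

```java
import java.util.regex.Pattern;

final class CodeTagSketch {
    // <tt>...</tt> or <code>...</code> whose body has no &, <, > or backslash.
    private static final Pattern TAG =
            Pattern.compile("<(tt|code)>([^&<>\\\\]+)</\\1>");

    /** Rewrites simple <tt>/<code> elements in a javadoc string to {@code ...}. */
    static String toCodeTag(String javadoc) {
        return TAG.matcher(javadoc).replaceAll("{@code $2}");
    }

    public static void main(String[] args) {
        System.out.println(toCodeTag("Returns <code>true</code> or <tt>null</tt>."));
        // Left untouched because the body contains an entity:
        System.out.println(toCodeTag("<code>a &lt; b</code>"));
    }
}
```

Unlike query-replace-regexp, this replaces every match without asking, so the automated javadoc comparison mentioned above becomes the safety net.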

I suggest as a goal, modifying java.{lang,util,io,nio}

Martin


From i30817 at gmail.com  Wed Mar 24 01:26:50 2010
From: i30817 at gmail.com (Paulo Levi)
Date: Wed, 24 Mar 2010 01:26:50 +0000
Subject: Superpackages and final
Message-ID: <212322091003231826r53533954n2f644a2551b9f916@mail.gmail.com>

Can superpackages mark exported classes as final?

What I mean is: can I export a class "as final" instead of marking it
final?
I ask because in some situations separating an unusual object capability into
a subtype would be advantageous memory-wise, but the type has to be final,
because of backward compatibility or design.

I'm going to give a rather radical example from java.lang: String has a
substring capability and it uses two int fields (of three, plus a char array)
to do it, when it could return a private subclass of String on substring. It
doesn't, serialization complications aside, I guess because String is
designed not to be extended, for immutability reasons.

The problem is that final has no granularity: it is all or nothing,
namespace-wise.
Well, I asked the question, but I don't have much hope.

From i30817 at gmail.com  Wed Mar 24 01:29:26 2010
From: i30817 at gmail.com (Paulo Levi)
Date: Wed, 24 Mar 2010 01:29:26 +0000
Subject: Superpackages and final
In-Reply-To: <212322091003231826r53533954n2f644a2551b9f916@mail.gmail.com>
References: <212322091003231826r53533954n2f644a2551b9f916@mail.gmail.com>
Message-ID: <212322091003231829g33a23f2bpb2ca5d2e7e6487c7@mail.gmail.com>

An alternative for java++ would be an immutable keyword instead of final.
Such classes could only be extended by classes that are also
immutable.

Type systems are still so primitive...

From neal at gafter.com  Wed Mar 24 02:02:22 2010
From: neal at gafter.com (Neal Gafter)
Date: Tue, 23 Mar 2010 19:02:22 -0700
Subject: Superpackages and final
In-Reply-To: <212322091003231826r53533954n2f644a2551b9f916@mail.gmail.com>
References: <212322091003231826r53533954n2f644a2551b9f916@mail.gmail.com>
Message-ID: <15e8b9d21003231902y6b033671t812d88e2adf89b7@mail.gmail.com>

You could make the constructor module-private, so that the class can only be
extended within the module, and use public static factory methods to create
instances.  You don't need any special language support to do that.
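
A small sketch of that pattern in today's Java, using a package-private constructor as a stand-in for module-private (which the language does not have). The classes Text and SubText are hypothetical and only echo the substring example from the question: outside code sees one type and cannot extend it, while the package itself is free to return a specialised subclass.

```java
/** Would be public in a real library; kept package-private here for brevity. */
class Text {
    final char[] chars;

    // Package-private constructor: code outside the package cannot subclass.
    Text(char[] chars) {
        this.chars = chars;
    }

    /** Static factory, the only way for outside code to obtain an instance. */
    public static Text of(String s) {
        return new Text(s.toCharArray());
    }

    public int length() { return chars.length; }

    public char charAt(int index) { return chars[index]; }

    /** Returns a view sharing the backing array; only views pay for offset/count. */
    public Text subText(int begin, int end) {
        if (begin < 0 || end > length() || begin > end) {
            throw new IndexOutOfBoundsException("begin " + begin + ", end " + end);
        }
        return new SubText(chars, begin, end - begin);
    }

    @Override public String toString() { return new String(chars); }
}

/** Hidden subclass carrying the two extra int fields. */
final class SubText extends Text {
    private final int offset;
    private final int count;

    SubText(char[] chars, int offset, int count) {
        super(chars);
        this.offset = offset;
        this.count = count;
    }

    @Override public int length() { return count; }

    @Override public char charAt(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chars[offset + index];
    }

    @Override public Text subText(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new IndexOutOfBoundsException("begin " + begin + ", end " + end);
        }
        return new SubText(chars, offset + begin, end - begin);
    }

    @Override public String toString() { return new String(chars, offset, count); }
}
```

For example, Text.of("hello").subText(1, 4) yields a SubText whose toString() is "ell", yet callers only ever see the type Text.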

On Tue, Mar 23, 2010 at 6:26 PM, Paulo Levi <i30817 at gmail.com> wrote:

> Can superpackages mark exported classes as final?
>
> What I mean is: can I export a class "as final" instead of marking it
> final?
> I ask because in some situations separating an unusual object capability
> into a subtype would be advantageous memory-wise, but the type has to be
> final, because of backward compatibility or design.
>
> I'm going to give a rather radical example from java.lang: String has a
> substring capability and it uses two int fields (of three, plus a char array)
> to do it, when it could return a private subclass of String on substring. It
> doesn't, serialization complications aside, I guess because String is
> designed not to be extended, for immutability reasons.
>
> The problem is that final has no granularity: it is all or nothing,
> namespace-wise.
> Well, I asked the question, but I don't have much hope.
>
>

From i30817 at gmail.com  Wed Mar 24 02:12:25 2010
From: i30817 at gmail.com (Paulo Levi)
Date: Wed, 24 Mar 2010 02:12:25 +0000
Subject: Superpackages and final
In-Reply-To: <15e8b9d21003231902y6b033671t812d88e2adf89b7@mail.gmail.com>
References: <212322091003231826r53533954n2f644a2551b9f916@mail.gmail.com> 
	<15e8b9d21003231902y6b033671t812d88e2adf89b7@mail.gmail.com>
Message-ID: <212322091003231912k42b8183dj776ae316162f7ca1@mail.gmail.com>

I see. I guess I didn't think it through.
So making the constructor package-private while keeping the type public is
enough to block extension...

Not very self-documenting, though.

From martinrb at google.com  Wed Mar 24 02:17:20 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 23 Mar 2010 19:17:20 -0700
Subject: Concurrent calls to new Random() not random enough
In-Reply-To: <1ccfd1c11003231550x5bc5509dy3061513adafd0e40@mail.gmail.com>
References: <1ccfd1c11003231550x5bc5509dy3061513adafd0e40@mail.gmail.com>
Message-ID: <1ccfd1c11003231917j13b11b2dl46d3a801ecd05919@mail.gmail.com>

[+fy, jeremymanson]

Here's a much better test case,
and a proposed fix:

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/RandomSeedCollisions

This adds some initialization overhead, but also removes some
since
new Random()
no longer invokes a synchronized method.
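
One way to get both properties, sketched here as an assumption rather than as the contents of the webrev: keep a process-wide AtomicLong that is advanced with a lock-free compare-and-set on every call, and mix it with System.nanoTime(). Two threads that read the same clock value still receive different uniquifier values. The class name SeedSketch and both constants are illustrative only.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SeedSketch {
    // Illustrative starting value; any nonzero value works for this sketch.
    private static final AtomicLong seedUniquifier =
            new AtomicLong(8682522807148012L);

    // Illustrative odd multiplier; multiplying by an odd number modulo 2^64
    // is invertible, so the sequence never collapses to zero.
    private static final long MULTIPLIER = 181783497276652981L;

    /** Advances the shared state without locking and returns the new value. */
    static long nextUniquifier() {
        for (;;) {
            long current = seedUniquifier.get();
            long next = current * MULTIPLIER;
            if (seedUniquifier.compareAndSet(current, next)) {
                return next;
            }
            // Another thread won the race; retry with the fresh value.
        }
    }

    /** A seed that differs between calls even when the clock value repeats. */
    static long nextSeed() {
        return nextUniquifier() ^ System.nanoTime();
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(nextSeed()));
        System.out.println(Long.toHexString(nextSeed()));
    }
}
```

The XOR with the clock means two seeds can still coincide in principle, so this reduces collisions rather than ruling them out; the test program below is the way to measure the effect.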

----
import java.util.*;

public class RandomSeed {
    public static void main(String[] args) throws Throwable {
        class RandomCollector implements Runnable {
            long[] randoms = new long[1<<22];
            public void run() {
                for (int i = 0; i < randoms.length; i++)
                    randoms[i] = new Random().nextLong();
            }};
        final int threadCount = 2;
        List<RandomCollector> collectors = new ArrayList<RandomCollector>();
        List<Thread> threads = new ArrayList<Thread>();
        for (int i = 0; i < threadCount; i++) {
            RandomCollector r = new RandomCollector();
            collectors.add(r);
            threads.add(new Thread(r));
        }
        for (Thread thread : threads)
            thread.start();
        for (Thread thread : threads)
            thread.join();
        int collisions = 0;
        HashSet<Long> s = new HashSet<Long>();
        for (RandomCollector r : collectors) {
            for (long x : r.randoms) {
                if (s.contains(x))
                    collisions++;
                s.add(x);
            }
        }
        System.out.printf("collisions=%d%n", collisions);
    }
}
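
A back-of-the-envelope estimate helps when comparing the two versions of the test. Assuming, as an idealisation, that every call returns an independent and uniformly distributed value, the expected number of repeats among n draws from N possible values is roughly n^2 / (2N). The class and method below are hypothetical names introduced only for this illustration.

```java
final class BirthdayEstimate {
    // Approximate expected number of repeated values among n uniform draws
    // from `buckets` equally likely values (valid when n is much smaller than buckets).
    static double expectedRepeats(double n, double buckets) {
        return n * n / (2.0 * buckets);
    }

    public static void main(String[] args) {
        // Original test: 2 threads x 2^21 ints, 2^32 possible values.
        System.out.println(expectedRepeats(1 << 22, Math.pow(2, 32)));
        // Revised test: 2 threads x 2^22 longs, 2^64 possible values.
        System.out.println(expectedRepeats(1 << 23, Math.pow(2, 64)));
    }
}
```

Under that idealisation the int-based test would be expected to report on the order of 2048 repeats even with perfectly distinct seeds, while the long-based test should report essentially none. That may be part of why the nextLong() version is the more telling one: any collisions it prints point at the seeding rather than at the size of the value space.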


On Tue, Mar 23, 2010 at 15:50, Martin Buchholz <martinrb at google.com> wrote:
> Hi Sherman,
>
> This is a bug report (sorry, no fix this time)
>
> Synopsis: Concurrent calls to new Random() not random enough
> Description:
> new Random() promises this:
>    /**
>     * Creates a new random number generator. This constructor sets
>     * the seed of the random number generator to a value very likely
>     * to be distinct from any other invocation of this constructor.
>     */
>
> but if there are concurrent calls to new Random(), it does not
> do very well at fulfilling its contract.
>
> The following program should print out a number much closer to 0.
>
> import java.util.*;
>
> public class RandomSeed {
>    public static void main(String[] args) throws Throwable {
>        class RandomCollector implements Runnable {
>            int[] randoms = new int[1<<21];
>            public void run() {
>                for (int i = 0; i < randoms.length; i++)
>                    randoms[i] = new Random().nextInt();
>            }};
>        final int threadCount = 2;
>        List<RandomCollector> collectors = new ArrayList<RandomCollector>();
>        List<Thread> threads = new ArrayList<Thread>();
>        for (int i = 0; i < threadCount; i++) {
>            RandomCollector r = new RandomCollector();
>            collectors.add(r);
>            threads.add(new Thread(r));
>        }
>        for (Thread thread : threads)
>            thread.start();
>        for (Thread thread : threads)
>            thread.join();
>        int collisions = 0;
>        HashSet<Integer> s = new HashSet<Integer>();
>        for (RandomCollector r : collectors) {
>            for (int x : r.randoms) {
>                if (s.contains(x))
>                    collisions++;
>                s.add(x);
>            }
>        }
>        System.out.println(collisions);
>    }
> }
> ---
> ==> javac -source 1.6 -Xlint:all RandomSeed.java
> ==> java -esa -ea RandomSeed
> 876
>


From daniel.daugherty at sun.com  Wed Mar 24 03:20:33 2010
From: daniel.daugherty at sun.com (daniel.daugherty at sun.com)
Date: Wed, 24 Mar 2010 03:20:33 +0000
Subject: hg: jdk7/tl/jdk: 6915365: 3/4 assert(false,
	"Unsupported VMGlobal Type") at management.cpp:1540
Message-ID: <20100324032046.21DBD440B1@hg.openjdk.java.net>

Changeset: f8c9a5e3f5db
Author:    dcubed
Date:      2010-03-23 19:03 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/f8c9a5e3f5db

6915365: 3/4 assert(false,"Unsupported VMGlobal Type") at management.cpp:1540
Summary: Remove exception throw to decouple JDK and HotSpot additions of known types.
Reviewed-by: mchung

! src/share/native/sun/management/Flag.c


From martinrb at google.com  Wed Mar 24 07:32:13 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 24 Mar 2010 00:32:13 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA90188.3090902@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
Message-ID: <1ccfd1c11003240032y77a6b77fi73b39ea698673860@mail.gmail.com>

Hi Ulf,

You have this interesting optimization:

     public static boolean isSurrogate(char ch) {
-        return ch >= MIN_SURROGATE && ch < MAX_SURROGATE + 1;
+        return (ch -= MIN_SURROGATE) >= 0 && ch < MAX_SURROGATE + 1 - MIN_SURROGATE;
     }

Do you have any evidence that hotspot can produce better code from this,
or that there is a measurable performance improvement?
Or was this just an experiment?
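
Independently of the performance question, the two forms can at least be checked for equivalence over every char value. This is a minimal sketch; the class and method names are made up for illustration, and the constants are the public ones in java.lang.Character. Note that the compound assignment narrows the result back to char, so values below MIN_SURROGATE wrap around to large values and the ">= 0" test is always true.

```java
final class SurrogateCheckSketch {
    // The original form: two comparisons against the range bounds.
    static boolean twoComparisons(char ch) {
        return ch >= Character.MIN_SURROGATE && ch < Character.MAX_SURROGATE + 1;
    }

    // The form from the patch: subtract the lower bound, then compare once.
    // (ch -= ...) wraps modulo 2^16 because the result is stored back into a char.
    static boolean subtractThenCompare(char ch) {
        return (ch -= Character.MIN_SURROGATE) >= 0
                && ch < Character.MAX_SURROGATE + 1 - Character.MIN_SURROGATE;
    }

    // Counts the char values on which the two forms disagree.
    static int countDisagreements() {
        int disagreements = 0;
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char ch = (char) i;
            if (twoComparisons(ch) != subtractThenCompare(ch))
                disagreements++;
        }
        return disagreements;
    }
}
```

Whether HotSpot generates better code for either form is a separate matter that only a benchmark can settle.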

Martin


From martinrb at google.com  Wed Mar 24 08:24:26 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 24 Mar 2010 01:24:26 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BA56749.8020506@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA56749.8020506@gmx.de>
Message-ID: <1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>

Ulf, Sherman, Masayoshi,
here are changes for you to review.
Only the patch highSurrogate needs a separate bug filed
(and CCC, please)

Ulf, I've made some progress on integrating your changes,
although almost all of them have been somewhat martinized:

Ulf-style tidying, mostly whitespace.
[mq]: Character-warnings2
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings2

Very minor optimizations.  Barely worth doing.
Note my removal of the need to have n++ inside the loop.
imported patch ulf-opto
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ulf-opto

Addition of highSurrogate and lowSurrogate
imported patch highSurrogate
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/highSurrogate
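
The webrev itself is not reproduced in this thread. As a reminder of what such methods compute, here is a sketch of the standard UTF-16 arithmetic for splitting a supplementary code point, cross-checked against Character.toChars. The class and method names are illustrative assumptions and not the patch's code.

```java
final class SurrogateSplitSketch {
    // Leading (high) surrogate of a supplementary code point.
    static char high(int codePoint) {
        return (char) ((codePoint >>> 10)
                + (Character.MIN_HIGH_SURROGATE
                   - (Character.MIN_SUPPLEMENTARY_CODE_POINT >>> 10)));
    }

    // Trailing (low) surrogate: the low 10 bits offset into the low-surrogate range.
    static char low(int codePoint) {
        return (char) ((codePoint & 0x3ff) + Character.MIN_LOW_SURROGATE);
    }

    // True if the arithmetic above matches what Character.toChars produces.
    static boolean agreesWithToChars(int codePoint) {
        char[] units = Character.toChars(codePoint);
        return units.length == 2
                && units[0] == high(codePoint)
                && units[1] == low(codePoint);
    }
}
```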

Martin

On Sat, Mar 20, 2010 at 17:24, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Sherman, please again consider about shifting Surrogate.high/low to
> Character.high/lowSurrogate.


From martinrb at google.com  Wed Mar 24 08:32:28 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 24 Mar 2010 01:32:28 -0700
Subject: hg: jdk7/tl/jdk: 6860431: Character.isSurrogate(char ch)
In-Reply-To: <1ccfd1c10909021329i34005b1bi5816e695d71a174d@mail.gmail.com>
References: <20090831221217.2CEFA12912@hg.openjdk.java.net>
	<4A9CDB81.1050500@gmx.de>
	<1ccfd1c10909012021g78d4fa3cx5f6ab0792c3ba688@mail.gmail.com>
	<4A9E27BF.8000905@gmx.de>
	<1ccfd1c10909020927v74fe5ceekc91f4e4a4724a273@mail.gmail.com>
	<4A9E9FE9.7060107@redhat.com>
	<1ccfd1c10909021003o7b060a23ge700680cd75b07bf@mail.gmail.com>
	<4A9EA759.3050804@redhat.com> <4A9ECBAC.7060303@gmx.de>
	<1ccfd1c10909021329i34005b1bi5816e695d71a174d@mail.gmail.com>
Message-ID: <1ccfd1c11003240132i35b9a24fldc8b4defb24364bb@mail.gmail.com>

Xueming,

I believe you still owe me a review and bug filed for
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/javadoc-unicode-escapes/

Martin

On Wed, Sep 2, 2009 at 13:29, Martin Buchholz <martinrb at google.com> wrote:
> On Wed, Sep 2, 2009 at 12:46, Ulf Zibis<Ulf.Zibis at gmx.de> wrote:
>> Am 02.09.2009 19:11, David M. Lloyd schrieb:
>>>
>>> On 09/02/2009 12:03 PM, Martin Buchholz wrote:
>>>>
>>>> On Wed, Sep 2, 2009 at 09:40, David M. Lloyd <david.lloyd at redhat.com
>>>> <mailto:david.lloyd at redhat.com>> wrote:
>>>>    Why not just do {@code \uD800}?  I'm like 60% sure that would work
>>>>    just fine. :-)
>>>>
>>>>
>>>> I'm pretty sure it would fail.   Prove me wrong!
>>>> Searching the JDK sources for regex
>>>> ^ *\*.*\\u[0-9a-fA-F]{4}
>>>> is a good way to find javadoc bugs, e.g.
>>>> http://java.sun.com/javase/6/docs/api/java/lang/String.html#toLowerCase()
>>>
>>> Ah, you're right.  It worked in my previewer but not in the actual
>>> javadoc.  It's pretty bad that that sequence has special meaning but you
>>> can't escape a \ with another \.  I guess in the worst case you could always
>>> do \u005CD800 or something like that.
>>>
>>
>> Looks a little better, but not much. Did somebody try it (Martin)?
>
> Well.... learn something new every day.
> Let's turn this into a fix.
> It's yet another "turkish i" bug.
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/javadoc-unicode-escapes/
>
> Xueming, please file a bug and review.
>
> Synopsis: Unreadable \uXXXX in javadoc
> Description: Replace \uXXXX by \u005CXXXX, or simply delete
>
> Martin
>
>> If it works in a previewer, is there any chance to change the javadoc spec,
>> staying backwards compatible?
>>
>> -Ulf
>>
>>
>>
>
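
The regex quoted above for finding raw Unicode escapes in javadoc comment lines can be tried from Java as follows. This is a sketch; the class and method names are illustrative. The comments in the code deliberately avoid spelling out a literal backslash-u sequence, since the compiler translates such escapes even inside comments, which is the very problem under discussion.

```java
import java.util.regex.Pattern;

final class JavadocEscapeFinder {
    // Same expression as in the thread: a comment line (optional spaces, then '*')
    // that contains a backslash, the letter 'u' and four hex digits.
    private static final Pattern RAW_ESCAPE =
            Pattern.compile("^ *\\*.*\\\\u[0-9a-fA-F]{4}");

    // True if the source line looks like a javadoc line with a raw escape in it.
    static boolean isSuspect(String sourceLine) {
        return RAW_ESCAPE.matcher(sourceLine).find();
    }
}
```

Feeding each line of a source file to isSuspect gives the same hits as running the regex with grep over the JDK sources.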


From Ulf.Zibis at gmx.de  Wed Mar 24 17:20:17 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 24 Mar 2010 18:20:17 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003231559x25aef975hca5b81e9dfe9b09c@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B8EB46C.1010208@sun.com>	
	<4B92C263.9020404@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995D22.2020507@gmx.de>	
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>	
	<4BA8B285.1040403@gmx.de>
	<1ccfd1c11003231559x25aef975hca5b81e9dfe9b09c@mail.gmail.com>
Message-ID: <4BAA49D1.8010702@gmx.de>

Am 23.03.2010 23:59, schrieb Martin Buchholz:
> On Tue, Mar 23, 2010 at 05:22, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am 13.03.2010 00:04, schrieb Martin Buchholz:
>>      
>>>        
>>>> That reminds me that some months ago I prepared a beautified version of
>>>> Character's source (things like the above, replacing <code> with {@code},
>>>> fixing indentation inconsistencies, etc.). Would there be interest in
>>>> such a patch?
>>>>          
> I support the plan of fixing coding style in core libraries when there is
> consensus amongst developers, as there is with
> <code>  =>  @code
> and
> @exception =>  @throws
>    

I would also like to see 8-space indentation for continuation lines, like:
     if (aaaaaaaaaaaaaaa > bbbbbbbbbbbbb &&
             ccccccccccccccc > ddddddddddddddddd)
         doSomething();

+ opening braces at the end of the line instead of on a new line

+ blank line between package ... and import ...

+ no blank line between javadoc and class/method declaration

+ 2 spaces after period

+ proper indentation in @param @return @throws blocks

+ sparing use of braces, e.g. none for 1-line blocks (so more lines of
code fit on the same screen)

+
      * @see    #forDigit(int, int)
      * @see    Integer#toString(int, int)
instead of:
      * @see     java.lang.Character#forDigit(int, int)
      * @see     java.lang.Integer#toString(int, int)

+
          * range: U+DC00 through U+DFFF
instead of:
          * range: 0xDC00 through 0xDFFF

+
     {@link #isLowSurrogate(char)}
     {@link Character.UnicodeBlock}
instead of:
     {@linkplain #isLowSurrogate(char) isLowSurrogate}
<code>{@link Character.UnicodeBlock UnicodeBlock}</code>


> I think the right way to do this is to modify large portions of the
> java libraries using a script.  The script should be checked into
> the jdk repo as part of the fix.  There should be automated verification
> that the generated javadoc is left unchanged.
>
> There is precedent, for example the recent whitespace changes by Kelly,
> and my own fixes to @since in jdk6.
>
> To get you started, here is some elisp code that I have used when
> making such changes on a file-level:
>
> (defun tt-code ()
>    (interactive)
>    (query-replace-regexp "<\\(tt\\|code\\)>\\([^&<>\\\\]+\\)</\\1>"
> "{@code \\2}"))
>
> I suggest as a goal, modifying java.{lang,util,io,nio}
>    

That all sounds very good, so I will hold off on my manual changes.
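
For a scripted rather than interactive rewrite, the regular expression from the elisp snippet quoted above carries over to java.util.regex almost unchanged. The following is a sketch with made-up class and method names, not a proposed tool. Unlike query-replace-regexp it replaces every match without asking.

```java
import java.util.regex.Pattern;

final class CodeTagRewriter {
    // <tt>x</tt> or <code>x</code>, where x contains no '&', '<', '>' or backslash.
    // The backreference makes sure the closing tag matches the opening one.
    private static final Pattern OLD_STYLE =
            Pattern.compile("<(tt|code)>([^&<>\\\\]+)</\\1>");

    // Rewrites every simple <tt>/<code> element on the line to the {@code ...} form.
    static String rewrite(String javadocLine) {
        return OLD_STYLE.matcher(javadocLine).replaceAll("{@code $2}");
    }
}
```

Elements whose content contains entities such as &lt; are left alone on purpose, because {@code ...} would render them literally and so change the generated javadoc.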

-Ulf


From Xueming.Shen at Sun.COM  Wed Mar 24 17:22:47 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Wed, 24 Mar 2010 10:22:47 -0700
Subject: hg: jdk7/tl/jdk: 6860431: Character.isSurrogate(char ch)
In-Reply-To: <1ccfd1c11003240132i35b9a24fldc8b4defb24364bb@mail.gmail.com>
References: <20090831221217.2CEFA12912@hg.openjdk.java.net>
	<4A9CDB81.1050500@gmx.de>
	<1ccfd1c10909012021g78d4fa3cx5f6ab0792c3ba688@mail.gmail.com>
	<4A9E27BF.8000905@gmx.de>
	<1ccfd1c10909020927v74fe5ceekc91f4e4a4724a273@mail.gmail.com>
	<4A9E9FE9.7060107@redhat.com>
	<1ccfd1c10909021003o7b060a23ge700680cd75b07bf@mail.gmail.com>
	<4A9EA759.3050804@redhat.com> <4A9ECBAC.7060303@gmx.de>
	<1ccfd1c10909021329i34005b1bi5816e695d71a174d@mail.gmail.com>
	<1ccfd1c11003240132i35b9a24fldc8b4defb24364bb@mail.gmail.com>
Message-ID: <4BAA4A67.30802@sun.com>


CR 6937842 Created, P4 java/classes_lang Unreadable \uXXXX in javadoc

The change is fine. But maybe it would be better to "escape" the \u20ac
occurrences as well, instead of simply deleting them. Not a big deal.

-Sherman

Martin Buchholz wrote:
> Xueming,
>
> I believe you still owe me a review and bug filed for
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/javadoc-unicode-escapes/
>
> Martin
>
> On Wed, Sep 2, 2009 at 13:29, Martin Buchholz <martinrb at google.com> wrote:
>   
>> On Wed, Sep 2, 2009 at 12:46, Ulf Zibis<Ulf.Zibis at gmx.de> wrote:
>>     
>>> Am 02.09.2009 19:11, David M. Lloyd schrieb:
>>>       
>>>> On 09/02/2009 12:03 PM, Martin Buchholz wrote:
>>>>         
>>>>> On Wed, Sep 2, 2009 at 09:40, David M. Lloyd <david.lloyd at redhat.com
>>>>> <mailto:david.lloyd at redhat.com>> wrote:
>>>>>    Why not just do {@code \uD800}?  I'm like 60% sure that would work
>>>>>    just fine. :-)
>>>>>
>>>>>
>>>>> I'm pretty sure it would fail.   Prove me wrong!
>>>>> Searching the JDK sources for regex
>>>>> ^ *\*.*\\u[0-9a-fA-F]{4}
>>>>> is a good way to find javadoc bugs, e.g.
>>>>> http://java.sun.com/javase/6/docs/api/java/lang/String.html#toLowerCase()
>>>>>           
>>>> Ah, you're right.  It worked in my previewer but not in the actual
>>>> javadoc.  It's pretty bad that that sequence has special meaning but you
>>>> can't escape a \ with another \.  I guess in the worst case you could always
>>>> do \u005CD800 or something like that.
>>>>
>>>>         
>>> Looks a little better, but not much. Did somebody try it (Martin)?
>>>       
>> Well.... learn something new every day.
>> Let's turn this into a fix.
>> It's yet another "turkish i" bug.
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/javadoc-unicode-escapes/
>>
>> Xueming, please file a bug and review.
>>
>> Synopsis: Unreadable \uXXXX in javadoc
>> Description: Replace \uXXXX by \u005CXXXX, or simply delete
>>
>> Martin
>>
>>     
>>> If it works in a previewer, is there any chance to change the javadoc spec,
>>> staying backwards compatible?
>>>
>>> -Ulf
>>>
>>>
>>>
>>>       


From Xueming.Shen at Sun.COM  Wed Mar 24 17:42:10 2010
From: Xueming.Shen at Sun.COM (Xueming Shen)
Date: Wed, 24 Mar 2010 10:42:10 -0700
Subject: Concurrent calls to new Random() not random enough
In-Reply-To: <1ccfd1c11003231917j13b11b2dl46d3a801ecd05919@mail.gmail.com>
References: <1ccfd1c11003231550x5bc5509dy3061513adafd0e40@mail.gmail.com>
	<1ccfd1c11003231917j13b11b2dl46d3a801ecd05919@mail.gmail.com>
Message-ID: <4BAA4EF2.1080306@sun.com>

6937857: Concurrent calls to new Random() not random enough

Martin Buchholz wrote:
> [+fy, jeremymanson]
>
> Here's a much better test case,
> and a proposed fix:
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/RandomSeedCollisions
>
> This adds some initialization overhead, but also removes some
> since
> new Random()
> no longer invokes a synchronized method.
>
> ----
> import java.util.*;
>
> public class RandomSeed {
>     public static void main(String[] args) throws Throwable {
>         class RandomCollector implements Runnable {
>             long[] randoms = new long[1<<22];
>             public void run() {
>                 for (int i = 0; i < randoms.length; i++)
>                     randoms[i] = new Random().nextLong();
>             }};
>         final int threadCount = 2;
>         List<RandomCollector> collectors = new ArrayList<RandomCollector>();
>         List<Thread> threads = new ArrayList<Thread>();
>         for (int i = 0; i < threadCount; i++) {
>             RandomCollector r = new RandomCollector();
>             collectors.add(r);
>             threads.add(new Thread(r));
>         }
>         for (Thread thread : threads)
>             thread.start();
>         for (Thread thread : threads)
>             thread.join();
>         int collisions = 0;
>         HashSet<Long> s = new HashSet<Long>();
>         for (RandomCollector r : collectors) {
>             for (long x : r.randoms) {
>                 if (s.contains(x))
>                     collisions++;
>                 s.add(x);
>             }
>         }
>         System.out.printf("collisions=%d%n", collisions);
>     }
> }
>
>
> On Tue, Mar 23, 2010 at 15:50, Martin Buchholz <martinrb at google.com> wrote:
>   
>> Hi Sherman,
>>
>> This is a bug report (sorry, no fix this time)
>>
>> Synopsis: Concurrent calls to new Random() not random enough
>> Description:
>> new Random() promises this:
>>    /**
>>     * Creates a new random number generator. This constructor sets
>>     * the seed of the random number generator to a value very likely
>>     * to be distinct from any other invocation of this constructor.
>>     */
>>
>> but if there are concurrent calls to new Random(), it does not
>> do very well at fulfilling its contract.
>>
>> The following program should print out a number much closer to 0.
>>
>> import java.util.*;
>>
>> public class RandomSeed {
>>    public static void main(String[] args) throws Throwable {
>>        class RandomCollector implements Runnable {
>>            int[] randoms = new int[1<<21];
>>            public void run() {
>>                for (int i = 0; i < randoms.length; i++)
>>                    randoms[i] = new Random().nextInt();
>>            }};
>>        final int threadCount = 2;
>>        List<RandomCollector> collectors = new ArrayList<RandomCollector>();
>>        List<Thread> threads = new ArrayList<Thread>();
>>        for (int i = 0; i < threadCount; i++) {
>>            RandomCollector r = new RandomCollector();
>>            collectors.add(r);
>>            threads.add(new Thread(r));
>>        }
>>        for (Thread thread : threads)
>>            thread.start();
>>        for (Thread thread : threads)
>>            thread.join();
>>        int collisions = 0;
>>        HashSet<Integer> s = new HashSet<Integer>();
>>        for (RandomCollector r : collectors) {
>>            for (int x : r.randoms) {
>>                if (s.contains(x))
>>                    collisions++;
>>                s.add(x);
>>            }
>>        }
>>        System.out.println(collisions);
>>    }
>> }
>> ---
>> ==> javac -source 1.6 -Xlint:all RandomSeed.java
>> ==> java -esa -ea RandomSeed
>> 876
>>
>>     


From martinrb at google.com  Wed Mar 24 18:48:27 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 24 Mar 2010 11:48:27 -0700
Subject: hg: jdk7/tl/jdk: 6860431: Character.isSurrogate(char ch)
In-Reply-To: <4BAA4A67.30802@sun.com>
References: <20090831221217.2CEFA12912@hg.openjdk.java.net>
	<4A9E27BF.8000905@gmx.de>
	<1ccfd1c10909020927v74fe5ceekc91f4e4a4724a273@mail.gmail.com>
	<4A9E9FE9.7060107@redhat.com>
	<1ccfd1c10909021003o7b060a23ge700680cd75b07bf@mail.gmail.com>
	<4A9EA759.3050804@redhat.com> <4A9ECBAC.7060303@gmx.de>
	<1ccfd1c10909021329i34005b1bi5816e695d71a174d@mail.gmail.com>
	<1ccfd1c11003240132i35b9a24fldc8b4defb24364bb@mail.gmail.com>
	<4BAA4A67.30802@sun.com>
Message-ID: <1ccfd1c11003241148g62745800o3f5f81b2a38e9215@mail.gmail.com>

On Wed, Mar 24, 2010 at 10:22, Xueming Shen <Xueming.Shen at sun.com> wrote:
>
> CR 6937842 Created, P4 java/classes_lang Unreadable \uXXXX in javadoc

Thanks.

> The change is fine. But maybe it would be better to "escape" the \u20ac as
> well, instead of
> simply deleting them. Not a big deal.

I prefer to leave them out, because the example has nothing to do
with exotic characters.

Martin


From jonathan.gibbons at sun.com  Wed Mar 24 19:19:58 2010
From: jonathan.gibbons at sun.com (jonathan.gibbons at sun.com)
Date: Wed, 24 Mar 2010 19:19:58 +0000
Subject: hg: jdk7/tl/langtools: 6937318: jdk7 b86: javah and javah -help is no
	output for these commands
Message-ID: <20100324192003.81CA4441A5@hg.openjdk.java.net>

Changeset: 3058880c0b8d
Author:    jjg
Date:      2010-03-24 12:18 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/3058880c0b8d

6937318: jdk7 b86:  javah and javah -help is no output for these commands
Reviewed-by: darcy

! src/share/classes/com/sun/tools/javah/JavahTask.java
! test/tools/javah/T6893943.java


From martinrb at google.com  Wed Mar 24 19:34:13 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 24 Mar 2010 12:34:13 -0700
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4BAA49D1.8010702@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995D22.2020507@gmx.de>
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
	<4BA8B285.1040403@gmx.de>
	<1ccfd1c11003231559x25aef975hca5b81e9dfe9b09c@mail.gmail.com>
	<4BAA49D1.8010702@gmx.de>
Message-ID: <1ccfd1c11003241234s5f7c4ec5l9570705d51892567@mail.gmail.com>

On Wed, Mar 24, 2010 at 10:20, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 23.03.2010 23:59, schrieb Martin Buchholz:

> I too would like to see 8 spaces indentation on line breaks like:
>    if (aaaaaaaaaaaaaaa > bbbbbbbbbbbbb &&
>            ccccccccccccccc > ddddddddddddddddd)
>        doSomething();

This appears to be a new style (perhaps coming from the java IDEs?)
but it would be too pervasive a change for the JDK sources.

> + opening braces at line end instead beginning a new line

Perhaps too difficult/controversial?

> + blank line between package ... and import ...

This could be done, and automated.

>
> + no blank line between javadoc and class/method declaration

Yes.

> + 2 spaces after period

I agree with this style, but there is not enough consensus.

> + proper indentation in @param @return @throws blocks

Perhaps too difficult to automate?

> + not too much use of braces e.g. for 1-line blocks (one can see more code
> lines on same screen space)

I agree with this personally, but there is violent disagreement
in the java programmer community.  E.g. google's style guide
requires braces everywhere.

> +
>     * @see    #forDigit(int, int)
>     * @see    Integer#toString(int, int)
> instead:
>     * @see     java.lang.Character#forDigit(int, int)
>     * @see     java.lang.Integer#toString(int, int)

I did a global s/java\.lang\.// in Character.java.

> +
>         * range: U+DC00 through U+DFFF
> instead
>         * range: 0xDC00 through 0xDFFF

I disagree.  The U+ notation should be reserved for
Unicode characters (code points) and not UTF-16
code units (which surrogates are).
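
The distinction can be made concrete with a single supplementary character. This is a small sketch with an illustrative class name; U+1D11E is one code point, but UTF-16 stores it as two code units, 0xD834 and 0xDD1E.

```java
final class CodeUnitDemo {
    // Builds a one-character string for the code point U+1D11E,
    // which lies outside the BMP and therefore needs a surrogate pair.
    static String gClef() {
        return new String(Character.toChars(0x1D11E));
    }

    public static void main(String[] args) {
        String s = gClef();
        System.out.println(s.length());                       // number of UTF-16 code units
        System.out.println(s.codePointCount(0, s.length()));  // number of code points
        System.out.printf("U+%X%n", s.codePointAt(0));        // the code point itself
        System.out.printf("0x%X 0x%X%n", (int) s.charAt(0), (int) s.charAt(1));
    }
}
```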

> +
>    {@link #isLowSurrogate(char)}
>    {@link Character.UnicodeBlock}
> instead
>    {@linkplain #isLowSurrogate(char) isLowSurrogate}
> <code>{@link Character.UnicodeBlock UnicodeBlock}</code>

I've removed the <code> above.

Martin


From joe.darcy at sun.com  Thu Mar 25 00:03:57 2010
From: joe.darcy at sun.com (joe.darcy at sun.com)
Date: Thu, 25 Mar 2010 00:03:57 +0000
Subject: hg: jdk7/tl/langtools: 6937417: javac -Xprint returns
	IndexOutOfBoundsException
Message-ID: <20100325000401.C8A36441EB@hg.openjdk.java.net>

Changeset: 65e422bbb984
Author:    darcy
Date:      2010-03-24 17:02 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/65e422bbb984

6937417: javac -Xprint returns IndexOutOfBoundsException
Reviewed-by: jjg

! src/share/classes/com/sun/tools/javac/processing/PrintingProcessor.java
+ test/tools/javac/processing/model/util/elements/VacuousEnum.java


From weijun.wang at sun.com  Thu Mar 25 04:09:04 2010
From: weijun.wang at sun.com (weijun.wang at sun.com)
Date: Thu, 25 Mar 2010 04:09:04 +0000
Subject: hg: jdk7/tl/jdk: 6813340: X509Factory should not depend on
	is.available()==0
Message-ID: <20100325040917.7E9654422E@hg.openjdk.java.net>

Changeset: 26477628f2d5
Author:    weijun
Date:      2010-03-25 12:07 +0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/26477628f2d5

6813340: X509Factory should not depend on is.available()==0
Reviewed-by: xuelei

! src/share/classes/sun/security/provider/X509Factory.java
! src/share/classes/sun/security/tools/KeyTool.java
+ test/java/security/cert/CertificateFactory/ReturnStream.java
+ test/java/security/cert/CertificateFactory/SlowStream.java
+ test/java/security/cert/CertificateFactory/slowstream.sh


From christopher.hegarty at sun.com  Thu Mar 25 09:40:31 2010
From: christopher.hegarty at sun.com (christopher.hegarty at sun.com)
Date: Thu, 25 Mar 2010 09:40:31 +0000
Subject: hg: jdk7/tl/jdk: 6937703: java/net regression test issues with samevm
Message-ID: <20100325094056.833F744286@hg.openjdk.java.net>

Changeset: 6109b166bf68
Author:    chegar
Date:      2010-03-25 09:38 +0000
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/6109b166bf68

6937703: java/net regression test issues with samevm
Reviewed-by: alanb

! test/ProblemList.txt
! test/java/net/ProxySelector/B6737819.java
! test/java/net/ResponseCache/ResponseCacheTest.java
! test/java/net/ResponseCache/getResponseCode.java
! test/java/net/URL/TestIPv6Addresses.java
! test/java/net/URLClassLoader/HttpTest.java
! test/java/net/URLConnection/B5052093.java
! test/java/net/URLConnection/contentHandler/UserContentHandler.java


From Ulf.Zibis at gmx.de  Thu Mar 25 13:41:31 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 14:41:31 +0100
Subject: Review patches isBMPCodePoint/2/3
In-Reply-To: <1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>	
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>	
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>
Message-ID: <4BAB680B.7050606@gmx.de>

Am 23.03.2010 19:19, schrieb Martin Buchholz:
> Ulf,
>
> Please do not delete methods in Surrogate.java
> (because we take compatibility seriously)
>    

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint

I still think we should stick with Surrogate#isBMP, for the compatibility
reason given above.
Otherwise we should rename #neededFor etc. as well.
Please add @author Ulf Zibis and correct the copyright date.

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2/src/share/classes/java/lang/AbstractStringBuilder.java.sdiff.html
Looks good, but please add @author Ulf Zibis and correct copyright date.
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2/src/share/classes/java/lang/Character.java.sdiff.html
Looks good, but please add @author Ulf Zibis and correct copyright date.
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2/src/share/classes/java/lang/String.java.sdiff.html
Looks good, but please add @author Ulf Zibis and correct copyright date.
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint2/src/share/classes/sun/nio/cs/Surrogate.java.sdiff.html
Looks good, but I still think we should mark these methods as deprecated.
"Deprecated" just means "don't use it if at all possible"; IMO it is not
only for APIs that are actively harmful.
Imagine a user whose code relies on the existing sun package API because
there was no appropriate public API in the past.
He is *used to ignoring* the warning: Surrogate is Sun proprietary API and
may be removed in a future release.
"Deprecated" will get his attention again, so he is likely to notice
that there are new APIs since JDK 7.
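
To illustrate what the deprecation route would look like, here is a sketch on a stand-in class. SurrogateCompat is a hypothetical name used only for this example; it is not sun.nio.cs.Surrogate and not part of any webrev. The method keeps working, but callers in other compilation units get a deprecation warning that names the replacement.

```java
// Hypothetical stand-in for an internal helper class; illustration only.
final class SurrogateCompat {
    /**
     * Tells whether the given code point needs a surrogate pair in UTF-16.
     *
     * @deprecated Use {@link Character#isSupplementaryCodePoint(int)} instead.
     */
    @Deprecated
    static boolean neededFor(int uc) {
        // Behaviour is unchanged; only the annotation and javadoc tag are new.
        return Character.isSupplementaryCodePoint(uc);
    }
}
```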

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint3

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint3/test/java/nio/charset/coders/BashStreams.java.sdiff.html
         if (Character.isBMPCodePoint(c) && (c >= '\uFFFE'
                  || Character.isSurrogate((char))c)));
Please use 8-space indentation for line continuations; the following looks ugly:
  259                         if (Character.isHighSurrogate(c)
  260 && (cb.remaining() == 1)) {
  261                             cg.push(c);
  262                             break;
  263                         }


-Ulf

> but instead gently denigrate them,
> as I do below (added to my patch isBMPCodePoint2)
>
> diff --git a/src/share/classes/sun/nio/cs/Surrogate.java
> b/src/share/classes/sun/nio/cs/Surrogate.java
> --- a/src/share/classes/sun/nio/cs/Surrogate.java
> +++ b/src/share/classes/sun/nio/cs/Surrogate.java
> @@ -77,6 +77,7 @@
>       /**
>        * Tells whether or not the given UCS-4 character must be represented as a
>        * surrogate pair in UTF-16.
> +     * Use of {@link Character#isSupplementaryCodePoint} is generally
> preferred.
>        */
>       public static boolean neededFor(int uc) {
>           return Character.isSupplementaryCodePoint(uc);
> @@ -102,6 +103,7 @@
>
>       /**
>        * Converts the given surrogate pair into a 32-bit UCS-4 character.
> +     * Use of {@link Character#toCodePoint} is generally preferred.
>        */
>       public static int toUCS4(char c, char d) {
>           assert Character.isHighSurrogate(c)&&  Character.isLowSurrogate(d);
>
>
> Martin
>
>
>    


From Ulf.Zibis at gmx.de  Thu Mar 25 14:03:47 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 15:03:47 +0100
Subject: Review patches isBMPCodePoint/2/3
In-Reply-To: <4BAB680B.7050606@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>		<4BA9256A.2020602@sun.com>
	<4BA92673.3030200@sun.com>		<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>		<4BA8E844.7080901@gmx.de>
	<4BA90188.3090902@gmx.de>	<1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>
	<4BAB680B.7050606@gmx.de>
Message-ID: <4BAB6D43.3010808@gmx.de>

Am 25.03.2010 14:41, schrieb Ulf Zibis:
> Please use 8-space-indentation for line continuation, following looks 
> ugly:

Oops, looked good in my TB edit window, but should be corrected:
  259                         if (Character.isHighSurrogate(c)
  260 && (cb.remaining() == 1)) {
  261                             cg.push(c);
  262                             break;
  263                         }


From Ulf.Zibis at gmx.de  Thu Mar 25 15:26:13 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 16:26:13 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003221503r46e6bb78g241e2b07ff7f1b3c@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA543A0.2060600@gmx.de>	
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>	
	<4BA60A40.9050600@gmx.de>	
	<1ccfd1c11003210916rad35d31wb0501f9bf960b07@mail.gmail.com>	
	<4BA78011.2070504@gmx.de>
	<1ccfd1c11003221503r46e6bb78g241e2b07ff7f1b3c@mail.gmail.com>
Message-ID: <4BAB8095.8030903@gmx.de>

Am 22.03.2010 23:03, schrieb Martin Buchholz:
> On Mon, Mar 22, 2010 at 07:34, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am 21.03.2010 17:16, schrieb Martin Buchholz:
>>      
>    
>>> There is a debate about whether to reuse existing exception classes
>>> or to throw class-specific subclasses.  IMO, IOOBE is a sufficiently
>>> expressive
>>> exception that I might have used just that, with expressive detail
>>> messages.
>>>
>>>        
>> I'm with you. StringIndexOutOfBoundsException in particular looks like
>> superfluous sugar to me. But we have it in the docs, so there is no way to
>> get rid of it.
>> What do you think about refactoring most IOOBEs in String-related classes to
>> SIOOBEs? It would stay compatible with old software, which still catches
>> IOOBEs, but would look more straightforward, tidy and clean, and would fix
>> the bug mentioned below.
>>      
> Every change is an incompatible change, with a risk/benefit tradeoff.
>
> IMO there is no change to the exceptions thrown, or declared to be thrown,
> or to their detail messages, in the string classes that is worth the risk
> of incompatible change.
>    

That is somewhat reasonable, but what is gained by those "creative"
variations on exception messages _and_ types in AbstractStringBuilder?
throw new StringIndexOutOfBoundsException();
throw new StringIndexOutOfBoundsException(index);
throw new StringIndexOutOfBoundsException(start);
throw new StringIndexOutOfBoundsException("start > length()");
throw new StringIndexOutOfBoundsException("start > end");
throw new StringIndexOutOfBoundsException(end - start);
throw new StringIndexOutOfBoundsException(srcEnd);
throw new StringIndexOutOfBoundsException("srcBegin > srcEnd");
throw new IndexOutOfBoundsException();
throw new IndexOutOfBoundsException("start " + start + ", end " + end + 
", s.length() " + s.length());
throw new IndexOutOfBoundsException("dstOffset "+dstOffset);


> (with the exception of when the implementation contradicts the spec,
> which is worth fixing)
>    

#insert(int, char[], int, int) uses System.arraycopy().
If the capacity doesn't suffice, it would throw an IOOBE, not an SIOOBE.

#insert(int, CharSequence) states:
      * @throws     IndexOutOfBoundsException  if the offset is invalid.
but (1) in fact throws SIOOBE in the described case if the CharSequence is a 
String,
and (2) additionally throws IOOBE in case of capacity overflow, which is 
not mentioned.

#insert(...) methods mix (int index, ...) and (int dstIndex, 
...) for no reason.

#substring(int) could be faster if it skipped the detailed bounds 
checking of substring(int, int).

#subSequence(int, int) in fact throws SIOOBE instead of IOOBE.

#appendCodePoint(int) could throw AIOOBE; as with many other append 
methods, the capacity overflow behaviour is not documented.
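
To make the compatibility point concrete, here is a minimal sketch (not taken from the JDK sources) that reports which exception class a few StringBuilder calls actually throw. The class and helper names are hypothetical, and the concrete exception type may differ between JDK releases; the only assumption relied on is that StringIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException, so callers catching IOOBE keep working either way.

```java
public class BoundsExceptionDemo {
    /**
     * Runs the action and returns the simple class name of the
     * IndexOutOfBoundsException (or subclass) it throws, or null if nothing is thrown.
     */
    static String thrownBy(Runnable action) {
        try {
            action.run();
            return null;
        } catch (IndexOutOfBoundsException e) {
            // StringIndexOutOfBoundsException is a subclass, so it is caught here as well.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        // All three calls use invalid arguments on purpose.
        System.out.println("subSequence(2, 1): " + thrownBy(() -> sb.subSequence(2, 1)));
        System.out.println("insert(5, String): " + thrownBy(() -> sb.insert(5, "x")));
        System.out.println("insert(5, char[]): " + thrownBy(() -> sb.insert(5, new char[] {'x'})));
    }
}
```

Running this on a given JDK shows whether the spec'ed IOOBE or the more specific SIOOBE is thrown in each case.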

I stop here ... ;-)

-Ulf


From Ulf.Zibis at gmx.de  Thu Mar 25 16:18:44 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 17:18:44 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003240032y77a6b77fi73b39ea698673860@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>	
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>	
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003240032y77a6b77fi73b39ea698673860@mail.gmail.com>
Message-ID: <4BAB8CE4.8020804@gmx.de>

Am 24.03.2010 08:32, schrieb Martin Buchholz:
> Hi Ulf,
>
> You have this interesting optimization:
>
>       public static boolean isSurrogate(char ch) {
> -        return ch>= MIN_SURROGATE&&  ch<  MAX_SURROGATE + 1;
> +        return (ch -= MIN_SURROGATE)>= 0&&  ch<  MAX_SURROGATE + 1 -
> MIN_SURROGATE;
>       }
>
> Do you have any evidence that hotspot can produce better code from this,
> or that there is a measurable performance improvement?
> Or was this just an experiment?
>    

If isHighSurrogate and isSurrogate are used consecutively on the same char, 
the result of ch -= MIN_SURROGATE could be used for both.
If isLowSurrogate and isSurrogate are used consecutively on the same char, 
the result of ch -= MAX_SURROGATE would fit better.
If isHighSurrogate and isLowSurrogate are used consecutively on the same char, 
the result of ch -= MIN_LOW_SURROGATE would fit better.

I suggest using the 1st pair in the JDK library.
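
The subtract-first range check discussed here can be sketched as follows. This is only an illustration with hypothetical class and method names, not the code from the patch. It relies on char arithmetic wrapping modulo 2^16, so after the subtraction a single comparison covers both bounds (and the ">= 0" test is always true for a char). Whether HotSpot generates better code for it is exactly the open question, so the sketch checks only equivalence.

```java
public class SurrogateRangeCheck {
    /** Plain form: two comparisons against the range bounds. */
    static boolean isSurrogateCompare(char ch) {
        return ch >= Character.MIN_SURROGATE && ch < Character.MAX_SURROGATE + 1;
    }

    /** Subtract-first form: values below MIN_SURROGATE wrap around to large chars. */
    static boolean isSurrogateSubtract(char ch) {
        ch -= Character.MIN_SURROGATE;
        return ch < Character.MAX_SURROGATE + 1 - Character.MIN_SURROGATE;
    }

    /** Same subtraction reused for the high-surrogate test (the "1st pair"). */
    static boolean isHighSurrogateSubtract(char ch) {
        ch -= Character.MIN_SURROGATE;
        return ch < Character.MAX_HIGH_SURROGATE + 1 - Character.MIN_SURROGATE;
    }

    /** Counts chars for which the subtract-first forms disagree with the reference forms. */
    static int countMismatches() {
        int mismatches = 0;
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char ch = (char) i;
            if (isSurrogateCompare(ch) != isSurrogateSubtract(ch)) mismatches++;
            if (Character.isHighSurrogate(ch) != isHighSurrogateSubtract(ch)) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```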

-Ulf


From Ulf.Zibis at gmx.de  Thu Mar 25 17:19:06 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 18:19:06 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA56749.8020506@gmx.de>
	<1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>
Message-ID: <4BAB9B0A.7030207@gmx.de>

Am 24.03.2010 09:24, schrieb Martin Buchholz:
> Ulf, Sherman, Masayoshi,
> here are changes for you to review.
> Only the patch highSurrogate needs a separate bug filed
> (and CCC, please)
>
> Ulf, I've made some progress on integrating your changes,
> although almost all of them have been somewhat martinized:
>
> Ulf-style tidying, mostly whitespace.
> [mq]: Character-warnings2
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings2
>    

I would prefer (better visibility of continued line):

public final  class Character
         implements java.io.Serializable, Comparable<Character>  {

I would prefer (indicates that we are in the current class):

     #isDigit(char)
instead of
     Character#isDigit(char)
which is indeed better than
     java.lang.Character#isDigit(char)

> Very minor optimizations.  Barely worth doing.
> Note my removal of the need to have n++ inside the loop.
>    

I overlooked that. Shame on me, as that's true Ulf-style. Yes, it reduces 
in/decrements in the rare supplementary cases.

> imported patch ulf-opto
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ulf-opto
>
> Addition of highSurrogate and lowSurrogate
> imported patch highSurrogate
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/highSurrogate
>    

Looks good. Interesting workaround for my "Note:".
I had expected my highSurrogate(char highCPWord, char 
lowCPWord) to be dropped.
Anyway, I'd like to note that I use that shortcut in my EUC_TW$Decoder 
twiddling. The following code:

             da[dp] = Character.highSurrogate(0x20000 + c);
results in (19 bytes):
   0x00b8ae27: add    $0x20000,%ecx      ;*iadd
                                         ; - sun.nio.cs.ext.D_21_d_narrow::decode@98 (line 196)
   0x00b8ae2d: mov    %ecx,%ebp
   0x00b8ae2f: shr    $0xa,%ebp
   0x00b8ae32: add    $0xd7c0,%ebp       ;*isub
                                         ; - java.lang.Character::highSurrogate@9 (line 3343)
                                         ; - sun.nio.cs.ext.D_21_d_narrow::decode@99 (line 196)

             da[dp] = Character.highSurrogate((char)0x2, c);
results in (9 bytes):
   0x00b899e7: shr    $0xa,%ebp
   0x00b899ea: add    $0xd840,%ebp       ;*isub
                                         ; - java.lang.Character::highSurrogate@14 (line 3365)
                                         ; - sun.nio.cs.ext.D_22_d_n_fastSurrogate::decode@97 (line 196)


             dst.putInt(Character.highSurrogate((char)0x2, c) << 16 |
                     Character.lowSurrogate(c));
would additionally increase performance. I'm still preparing the 
benchmark + disassembly.

This twiddling could be used in all charset coders that process 
surrogates, e.g. possibly the UTF_x ones.
If public, it would also be useful for developers writing charset coders for 
exotic charsets via java.nio.charset.spi.CharsetProvider.
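
For readers without the webrev at hand, the arithmetic behind the two disassemblies can be sketched like this. The class name, the constant HIGH_OFFSET and the two-argument overload are hypothetical reconstructions for illustration, not the code under review. The sketch assumes the code point is split into a high 16-bit word (here the constant 0x2) and a low 16-bit word c, which is what lets the addition of 0x20000 fold into the constant 0xD840.

```java
public class SurrogateShortcut {
    // 0xD800 - (0x10000 >>> 10) = 0xD7C0, the constant seen in the first disassembly.
    static final int HIGH_OFFSET =
            Character.MIN_HIGH_SURROGATE - (Character.MIN_SUPPLEMENTARY_CODE_POINT >>> 10);

    /** One-argument form: works on the full supplementary code point. */
    static char highSurrogate(int codePoint) {
        return (char) ((codePoint >>> 10) + HIGH_OFFSET);
    }

    /**
     * Hypothetical two-argument form: code point = (highCPWord << 16) + lowCPWord.
     * With highCPWord == 2 the constant part folds to 0xD7C0 + 0x80 = 0xD840.
     */
    static char highSurrogate(char highCPWord, char lowCPWord) {
        return (char) ((lowCPWord >>> 10) + (highCPWord << 6) + HIGH_OFFSET);
    }

    /** The low surrogate depends only on the low 10 bits. */
    static char lowSurrogate(char lowCPWord) {
        return (char) ((lowCPWord & 0x3FF) + Character.MIN_LOW_SURROGATE);
    }

    /** Packs both surrogates into one int, high surrogate in the upper 16 bits. */
    static int packPair(char highCPWord, char lowCPWord) {
        return highSurrogate(highCPWord, lowCPWord) << 16 | lowSurrogate(lowCPWord);
    }

    public static void main(String[] args) {
        char c = 0x1234;
        System.out.printf("%08X%n", packPair((char) 0x2, c));
    }
}
```

The packed int could then be written with a single putInt call on a big-endian buffer, which is the idea behind the last snippet above; any speedup would still need the benchmark.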

-Ulf


From Ulf.Zibis at gmx.de  Thu Mar 25 17:20:20 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 18:20:20 +0100
Subject: hg: jdk7/tl/jdk: 6860431: Character.isSurrogate(char ch)
In-Reply-To: <1ccfd1c11003241148g62745800o3f5f81b2a38e9215@mail.gmail.com>
References: <20090831221217.2CEFA12912@hg.openjdk.java.net>	
	<4A9E27BF.8000905@gmx.de>	
	<1ccfd1c10909020927v74fe5ceekc91f4e4a4724a273@mail.gmail.com>	
	<4A9E9FE9.7060107@redhat.com>	
	<1ccfd1c10909021003o7b060a23ge700680cd75b07bf@mail.gmail.com>	
	<4A9EA759.3050804@redhat.com> <4A9ECBAC.7060303@gmx.de>	
	<1ccfd1c10909021329i34005b1bi5816e695d71a174d@mail.gmail.com>	
	<1ccfd1c11003240132i35b9a24fldc8b4defb24364bb@mail.gmail.com>	
	<4BAA4A67.30802@sun.com>
	<1ccfd1c11003241148g62745800o3f5f81b2a38e9215@mail.gmail.com>
Message-ID: <4BAB9B54.7070102@gmx.de>

Am 24.03.2010 19:48, schrieb Martin Buchholz:
> On Wed, Mar 24, 2010 at 10:22, Xueming Shen<Xueming.Shen at sun.com>  wrote:
>    
>> CR 6937842 Created, P4 java/classes_lang Unreadable \uXXXX in javadoc
>>      
> Thanks.
>
>    
>> The change fine. But maybe it would be better to "escape" the \u20ac as
>> well, instead of
>> simply deleting them. Not a big deal.
>>      
> I prefer to leave them out, because the example has nothing to do
> with exotic characters.
>    

+1

-Ulf


From Ulf.Zibis at gmx.de  Thu Mar 25 20:26:26 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 21:26:26 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003241234s5f7c4ec5l9570705d51892567@mail.gmail.com>
References: <4A95079A.8080803@gmx.de>	
	<1ccfd1c11003091705k44447654wbdb311a48a1c7bb4@mail.gmail.com>	
	<4B97E3BD.2000901@sun.com> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995D22.2020507@gmx.de>	
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>	
	<4BA8B285.1040403@gmx.de>	
	<1ccfd1c11003231559x25aef975hca5b81e9dfe9b09c@mail.gmail.com>	
	<4BAA49D1.8010702@gmx.de>
	<1ccfd1c11003241234s5f7c4ec5l9570705d51892567@mail.gmail.com>
Message-ID: <4BABC6F2.70604@gmx.de>

Am 24.03.2010 20:34, schrieb Martin Buchholz:
> On Wed, Mar 24, 2010 at 10:20, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am 23.03.2010 23:59, schrieb Martin Buchholz:
>>      
>    
>> I too would like to see 8 spaces indentation on line breaks like:
>>     if (aaaaaaaaaaaaaaa>  bbbbbbbbbbbbb&&
>>             ccccccccccccccc>  ddddddddddddddddd)
>>         doSomething();
>>      
> This appears to be a new style (perhaps coming from the java IDEs?)
>    

This rule is much older: 
http://java.sun.com/docs/codeconv/html/CodeConventions.doc3.html#248
But yes, I first saw this in the NetBeans IDE formatting facility.

> but it would be too pervasive a change for the JDK sources.
>    

but wouldn't be a big deal for old stagers when coding new lines. ;-)

>    
>> + opening braces at line end instead beginning a new line
>>      
> Perhaps too difficult/controversial?
>    

Yes, you will know that better from your team.
See: http://java.sun.com/docs/codeconv/html/CodeConventions.doc10.html#182

>    
>> + blank line between package ... and import ...
>>      
> This could be done, and automated.
>
>    
>> + no blank line between javadoc and class/method declaration
>>      
> Yes.
>
>    
>> + 2 spaces after period
>>      
> I agree with this style, but there is not enough consensus.
>    

See comment for braces at line end.
See: http://java.sun.com/j2se/javadoc/writingdoccomments/index.html#examples
See: http://java.sun.com/j2se/javadoc/writingdoccomments/package-template

>    
>> + proper indentation in @param @return @throws blocks
>>      
> Perhaps too difficult to automate?
>    

I can understand. For the Character class I did it manually.
See: http://java.sun.com/j2se/javadoc/writingdoccomments/index.html#format
I guess we can ignore the 2nd column tabulator, especially if names 
become looooong

>    
>> + not too much use of braces e.g. for 1-line blocks (one can see more code
>> lines on same screen space)
>>      
> I agree with this personally, but there is violent disagreement
> in the java programmer community.  E.g. google's style guide
> requires braces everywhere.
>    

IMO, this should not be followed too bureaucratically. Think about 
laptop/netbook users or the many open windows in an IDE.
Additionally, scrolling becomes kind of a nightmare.
And as you can see in the existing code base, things are never as bad as 
they seem, or the devil is not as black as he is painted. ;-)

>    
>> +
>>      * @see    #forDigit(int, int)
>>      * @see    Integer#toString(int, int)
>> instead:
>>      * @see     java.lang.Character#forDigit(int, int)
>>      * @see     java.lang.Integer#toString(int, int)
>>      
> I did a global s/java\.lang\.// in Character.java.
>    

As justified before, I would drop the current class's name.
See: http://java.sun.com/j2se/javadoc/writingdoccomments/index.html#tag

>    
>> +
>>          * range: U+DC00 through U+DFFF
>> instead
>>          * range: 0xDC00 through 0xDFFF
>>      
> I disagree.  The U+ notation should be reserved for
> Unicode characters (code points) and not UTF-16
> code units (which surrogates are).
>    

I fully agree, but in the context where I wanted to change this, the 
matter actually was about code points, not code units, and ...
in the case of Java char/UTF-16 code units, IMO we should use \u notation.
0x notation should only be used for binary values of non-Unicode charsets.
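
The distinction between the notations can be illustrated with a small sketch. The class and method names are hypothetical. It prints a code point in U+ notation followed by its UTF-16 code units in backslash-u notation, using only Character.toChars and String.format.

```java
public class NotationDemo {
    /** Formats a code point as U+XXXX followed by its UTF-16 code units. */
    static String describe(int codePoint) {
        StringBuilder sb = new StringBuilder(String.format("U+%04X ->", codePoint));
        for (char unit : Character.toChars(codePoint)) {
            // Each char is one UTF-16 code unit; a supplementary code point yields two.
            sb.append(String.format(" \\u%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0x20AC));   // a BMP code point: one code unit
        System.out.println(describe(0x1D11E));  // a supplementary code point: a surrogate pair
    }
}
```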

>    
>> +
>>     {@link #isLowSurrogate(char)}
>>     {@link Character.UnicodeBlock}
>> instead
>>     {@linkplain #isLowSurrogate(char) isLowSurrogate}
>> <code>{@link Character.UnicodeBlock UnicodeBlock}</code>
>>      
> I've removed the<code>  above.
>    

BTW, I can't find any documentation about {@linkplain ...}.
What is its advantage over plain {@link ...}?

Additionally, I'd like to mention for class Character:
- numerous javadoc blocks are indented by only 3 instead of 4 spaces.
- some code lines are indented by 5 instead of 4 spaces.
- I still dislike the space after a cast; refer to internal review ID 
1740052.
- several UnicodeBlock declarations differ slightly in 
indentation/whitespace usage from the average. I would prefer:
         public static final UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_A =
             new UnicodeBlock("SUPPLEMENTARY_PRIVATE_USE_AREA_A",
                              new String[] { "Supplementary Private Use Area-A",
                                             "SupplementaryPrivateUseArea-A" });


-Ulf


From Ulf.Zibis at gmx.de  Thu Mar 25 20:42:40 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 21:42:40 +0100
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA56749.8020506@gmx.de>
	<1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>
Message-ID: <4BABCAC0.4010704@gmx.de>

Am 24.03.2010 09:24, schrieb Martin Buchholz:
> Ulf, Sherman, Masayoshi,
> here are changes for you to review.
> Only the patch highSurrogate needs a separate bug filed
> (and CCC, please)
>    

I already filed it 2 weeks ago, see:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6933322

-Ulf


From Ulf.Zibis at gmx.de  Thu Mar 25 20:52:42 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 21:52:42 +0100
Subject: review request 6933322 - Add methods highSurrogate(), lowSurrogate()
	to class Character
In-Reply-To: <4BABCAC0.4010704@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>		<4BA56749.8020506@gmx.de>	<1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>
	<4BABCAC0.4010704@gmx.de>
Message-ID: <4BABCD1A.2040205@gmx.de>

Updated topic.

-Ulf


Am 25.03.2010 21:42, schrieb Ulf Zibis:
> Am 24.03.2010 09:24, schrieb Martin Buchholz:
>> Ulf, Sherman, Masayoshi,
>> here are changes for you to review.
>> Only the patch highSurrogate needs a separate bug filed
>> (and CCC, please)
>
> I had just filed it 2 weeks ago, see:
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6933322
>
> -Ulf
>
>
>


From martinrb at google.com  Thu Mar 25 21:27:55 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 14:27:55 -0700
Subject: Review patches isBMPCodePoint/2/3
In-Reply-To: <4BAB680B.7050606@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>
	<4BAB680B.7050606@gmx.de>
Message-ID: <1ccfd1c11003251427t37d024dfta9b951b4f8137c80@mail.gmail.com>

On Thu, Mar 25, 2010 at 06:41, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 23.03.2010 19:19, schrieb Martin Buchholz:
>>
>> Ulf,
>>
>> Please do not delete methods in Surrogate.java
>> (because we take compatibility seriously)
>>
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint
>
> I still think, we should stick on Surrogate#isBMP for above compatibility
> reason.
> Otherwise we too should rename #neededFor etc.

The difference is that isBMP was not provided with any officially supported JDK
(i.e. Sun JDK 6).  We take compatibility seriously, but not that seriously :)

> Please add @author Ulf Zibis

$ rg -l '@author.*Zibis'
./src/share/classes/java/lang/Character.java
./src/share/classes/java/lang/AbstractStringBuilder.java
./src/share/classes/java/lang/String.java
./src/share/classes/sun/nio/cs/Surrogate.java

> and correct copyright date.

I leave that up to the Sun release engineers.

Martin


From martinrb at google.com  Thu Mar 25 21:47:06 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 14:47:06 -0700
Subject: Review patches isBMPCodePoint/2/3
In-Reply-To: <4BAB680B.7050606@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>
	<4BAB680B.7050606@gmx.de>
Message-ID: <1ccfd1c11003251447u697de858pbffc9db1a35cdb51@mail.gmail.com>

Here's another minor performance tweak to

    public String(int[] codePoints, int offset, int count) {

that optimizes for BMP.

        // Pass 1: Compute precise size of char[]
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBMPCodePoint(c))
                ;
            else if (Character.isSupplementaryCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint/

Martin


From Ulf.Zibis at gmx.de  Thu Mar 25 22:19:34 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 23:19:34 +0100
Subject: Review patches isBMPCodePoint/2/3
In-Reply-To: <1ccfd1c11003251447u697de858pbffc9db1a35cdb51@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>	
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>	
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>	
	<1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>	
	<4BAB680B.7050606@gmx.de>
	<1ccfd1c11003251447u697de858pbffc9db1a35cdb51@mail.gmail.com>
Message-ID: <4BABE176.4030107@gmx.de>

Am 25.03.2010 22:47, schrieb Martin Buchholz:
> Here's another minor performance tweak to
>
>      public String(int[] codePoints, int offset, int count) {
>
> that optimizes for BMP.
>
>          // Pass 1: Compute precise size of char[]
>          int n = count;
>          for (int i = offset; i<  end; i++) {
>              int c = codePoints[i];
>              if (Character.isBMPCodePoint(c))
>                  ;
>              else if (Character.isSupplementaryCodePoint(c))
>                  n++;
>              else throw new IllegalArgumentException(Integer.toString(c));
>          }
>    

Yes, this is a valuable pattern you have found.
I think it could look smarter/clearer:

             if (Character.isBMPCodePoint(c))
                 continue;
             if (Character.isSupplementaryCodePoint(c))
                 n++;
             else
                 throw new IllegalArgumentException(Integer.toString(c));

And this would be faster, as isSupplementaryCodePoint is not optimized 
for following an isBMPCodePoint check:

             if (Character.isBMPCodePoint(c))
                 continue;
             if (!Character.isValidCodePoint(c))
                 throw new IllegalArgumentException(Integer.toString(c));
             n++;
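
As a self-contained sketch of this sizing pass (the class and method names are hypothetical; Character.isBmpCodePoint is the spelling under which the method is available in released JDKs, while the thread writes isBMPCodePoint):

```java
public class Utf16Length {
    /**
     * Computes the number of chars needed to encode the given code points as UTF-16.
     * BMP code points take the fast path; everything else must be a valid
     * (hence supplementary) code point and costs one extra char.
     */
    static int utf16Length(int[] codePoints, int offset, int count) {
        final int end = offset + count;
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            if (!Character.isValidCodePoint(c))
                throw new IllegalArgumentException(Integer.toString(c));
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        int[] cps = { 'a', 0x1D11E, 0x20AC };
        System.out.println(utf16Length(cps, 0, cps.length));
    }
}
```

Negative values and values above 0x10FFFF fail both tests and end up in the exception branch.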

Before you go to the meeting, maybe scan the JDK for similar use cases, 
before I get addicted too, and don't forget to define c as final.
It's enough that I'm addicted to:
         // fill backwards for VM performance reasons: reduces register
         // pressure, faster compare against 0
         for (int i = end; n > 0; ) {
             int c = codePoints[--i];
             if (Character.isBMPCodePoint(c))
                 v[--n] = (char)c;
             else
                 Character.toSurrogates(c, v, n-=2);
         }
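
A complete version of the two passes, including the backward fill, might look like the sketch below. The names are hypothetical, and since Character.toSurrogates is not public, the public highSurrogate/lowSurrogate methods stand in for it. The sketch says nothing about whether filling backwards is actually faster; it only shows that the result matches new String(int[], int, int).

```java
public class CodePointsToChars {
    static char[] toChars(int[] codePoints, int offset, int count) {
        final int end = offset + count;

        // Pass 1: compute the precise size of the char[].
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            if (!Character.isValidCodePoint(c))
                throw new IllegalArgumentException(Integer.toString(c));
            n++;
        }

        // Pass 2: fill backwards; n counts down to 0, which also ends the loop.
        char[] v = new char[n];
        for (int i = end; n > 0; ) {
            int c = codePoints[--i];
            if (Character.isBmpCodePoint(c)) {
                v[--n] = (char) c;
            } else {
                n -= 2;
                v[n] = Character.highSurrogate(c);
                v[n + 1] = Character.lowSurrogate(c);
            }
        }
        return v;
    }

    public static void main(String[] args) {
        int[] cps = { 0x41, 0x1D11E, 0x20AC };
        System.out.println(new String(toChars(cps, 0, cps.length)).length());
    }
}
```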

-Ulf


> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint/
>
> Martin
>
>
>    


From Ulf.Zibis at gmx.de  Thu Mar 25 22:36:05 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 25 Mar 2010 23:36:05 +0100
Subject: Review patches isBMPCodePoint/2/3
In-Reply-To: <1ccfd1c11003251427t37d024dfta9b951b4f8137c80@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>	
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>	
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>	
	<1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>	
	<4BAB680B.7050606@gmx.de>
	<1ccfd1c11003251427t37d024dfta9b951b4f8137c80@mail.gmail.com>
Message-ID: <4BABE555.9070301@gmx.de>

Am 25.03.2010 22:27, schrieb Martin Buchholz:
> On Thu, Mar 25, 2010 at 06:41, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am 23.03.2010 19:19, schrieb Martin Buchholz:
>>      
>>> Ulf,
>>>
>>> Please do not delete methods in Surrogate.java
>>> (because we take compatibility seriously)
>>>
>>>        
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint
>>
>> I still think, we should stick on Surrogate#isBMP for above compatibility
>> reason.
>> Otherwise we too should rename #neededFor etc.
>>      
> The difference is that isBMP was not provided with any officially supported JDK
> (i.e. Sun JDK 6).  We take compatibility seriously, but not that seriously :)
>    

You're right.
So we can make it private, and no one would get the idea to use it 
inadvisedly.

> $ rg -l '@author.*Zibis'
> ./src/share/classes/java/lang/Character.java
> ./src/share/classes/java/lang/AbstractStringBuilder.java
> ./src/share/classes/java/lang/String.java
> ./src/share/classes/sun/nio/cs/Surrogate.java
>    

Thanks,

-Ulf


From martinrb at google.com  Thu Mar 25 23:33:44 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 16:33:44 -0700
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4BABC6F2.70604@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B99373E.40502@gmx.de>
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>
	<4B995D22.2020507@gmx.de>
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
	<4BA8B285.1040403@gmx.de>
	<1ccfd1c11003231559x25aef975hca5b81e9dfe9b09c@mail.gmail.com>
	<4BAA49D1.8010702@gmx.de>
	<1ccfd1c11003241234s5f7c4ec5l9570705d51892567@mail.gmail.com>
	<4BABC6F2.70604@gmx.de>
Message-ID: <1ccfd1c11003251633j3f735662m23dde18b8973bb9@mail.gmail.com>

On Thu, Mar 25, 2010 at 13:26, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 24.03.2010 20:34, schrieb Martin Buchholz:
>>
>> On Wed, Mar 24, 2010 at 10:20, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>>
>>>
>>> Am 23.03.2010 23:59, schrieb Martin Buchholz:
>>>
>>
>>
>>>
>>> I too would like to see 8 spaces indentation on line breaks like:
>>>    if (aaaaaaaaaaaaaaa>  bbbbbbbbbbbbb&&
>>>            ccccccccccccccc>  ddddddddddddddddd)
>>>        doSomething();
>>>
>>
>> This appears to be a new style (perhaps coming from the java IDEs?)
>>
>
> This rule is much older:
> http://java.sun.com/docs/codeconv/html/CodeConventions.doc3.html#248
> But yes, I first saw this from NetBeans IDE formatting facility.

Ahhh, thank you very much for this history lesson.

I have manually adjusted some source files as you requested,
but systematically fixing this particular coding style bug
is likely to be difficult.

>>>
>>> +
>>>     * @see    #forDigit(int, int)
>>>     * @see    Integer#toString(int, int)
>>> instead:
>>>     * @see     java.lang.Character#forDigit(int, int)
>>>     * @see     java.lang.Integer#toString(int, int)
>>>
>>
>> I did a global s/java\.lang\.// in Character.java.
>>
>
> As justified before, I would drop the current classes name.
> See: http://java.sun.com/j2se/javadoc/writingdoccomments/index.html#tag

For this particular source file,
I am going to mildly disagree with you,
and keep as is.

>>>         * range: U+DC00 through U+DFFF
>>> instead
>>>         * range: 0xDC00 through 0xDFFF
>>>
>>
>> I disagree.  The U+ notation should be reserved for
>> Unicode characters (code points) and not UTF-16
>> code units (which surrogates are).
>>
>
> I fully agree, but in the context, where I wanted to change this, the matter
> actually was about code points, not code units, and ...
> in case of Java char/UTF-16 code units, IMO we should use \u notation.
> 0x notation should only be used for none Unicode charsets binary values.

Oh, I see.  You are right.  Patch coming up.

> BTW, I can't find any docu about {@linkplain ...}.
> What is the advantage against simple {@link ...}?

http://java.sun.com/j2se/1.4.2/docs/tooldocs/javadoc/whatsnew-1.4.html
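
In short, {@linkplain} generates the same link as {@link} but renders the label in plain text instead of code font, which reads better when the label is ordinary prose. A minimal sketch (hypothetical class and method, chosen only to carry the doc comment):

```java
public class LinkStyles {
    /**
     * Tests whether the sequence ends with a
     * {@linkplain Character#isLowSurrogate(char) low surrogate}
     * (prose label, plain font). The check itself is done by
     * {@link Character#isLowSurrogate(char)} (identifier label, code font).
     *
     * @param  s the sequence to inspect
     * @return true if s is non-empty and its last char is a low surrogate
     */
    public static boolean endsWithLowSurrogate(CharSequence s) {
        return s.length() > 0 && Character.isLowSurrogate(s.charAt(s.length() - 1));
    }

    public static void main(String[] args) {
        System.out.println(endsWithLowSurrogate("abc"));
    }
}
```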

> Additionally I like to mention for class Character:
> - numerous javadoc blocks are only indented by 3 instead 4 spaces.

Addressed in one of my current patches.

> - several UnicodeBlock declarations differ little in indentation/whitespace
> usage from the average. I would prefer:
>        public static final UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_A =
>            new UnicodeBlock("SUPPLEMENTARY_PRIVATE_USE_AREA_A",
>                             new String[] { "Supplementary Private Use Area-A",
>                                            "SupplementaryPrivateUseArea-A" });

See forthcoming patch.


From martinrb at google.com  Thu Mar 25 23:37:08 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 16:37:08 -0700
Subject: Minor improvements to Character.UnicodeBlock
Message-ID: <1ccfd1c11003251637u6716d8efq5f8e4846ca07093@mail.gmail.com>

Hi Masayoshi and Ulf,

I'd like you to do a code review.

There are actual doc bugs in the specification
of the surrogate unicode blocks.

Ulf convinced me that we should use U+ notation for
Unicode blocks.

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/UnicodeBlock/

Martin


From martinrb at google.com  Thu Mar 25 23:47:19 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 16:47:19 -0700
Subject: Review patches isBMPCodePoint/2/3
In-Reply-To: <4BABE176.4030107@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003231119g19d9e4d2x881322e9ed6c9b27@mail.gmail.com>
	<4BAB680B.7050606@gmx.de>
	<1ccfd1c11003251447u697de858pbffc9db1a35cdb51@mail.gmail.com>
	<4BABE176.4030107@gmx.de>
Message-ID: <1ccfd1c11003251647i89937b9u2faf43e67135507@mail.gmail.com>

On Thu, Mar 25, 2010 at 15:19, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 25.03.2010 22:47, schrieb Martin Buchholz:
>>
>> Here's another minor performance tweak to
>>
>>     public String(int[] codePoints, int offset, int count) {
>>
>> that optimizes for BMP.
>>
>>         // Pass 1: Compute precise size of char[]
>>         int n = count;
>>         for (int i = offset; i < end; i++) {
>>             int c = codePoints[i];
>>             if (Character.isBMPCodePoint(c))
>>                 ;
>>             else if (Character.isSupplementaryCodePoint(c))
>>                 n++;
>>             else throw new IllegalArgumentException(Integer.toString(c));
>>         }
>>
>
> Yes, this is a valuable pattern, you found out.
> I think, it could look smarter/more clear:
>
>            if (Character.isBMPCodePoint(c))
>                continue;
>            if (Character.isSupplementaryCodePoint(c))
>                n++;
>            else
>                throw new IllegalArgumentException(Integer.toString(c));
>
> And this would be faster, as isSupplementaryCodePoint is not optimized for
> following isBMPCodePoint:
>
>            if (Character.isBMPCodePoint(c))
>                continue;
>            if (!Character.isValidCodePoint(c))
>                throw new IllegalArgumentException(Integer.toString(c));
>            n++;

Done.

> Before you go to the meeting, maybe scan the JDK for similar use cases,

Sorry, that's your job.

> before I get addicted too, and don't forget to define c as final.

I see no reason to declare c as final here.

> It's enough, that I'm addicted from:
>        // fill backwards for VM performance reasons, reduces register
> pressure, faster compare against 0
>        for (int i = end; n > 0; ) {
>            int c = codePoints[--i];
>            if (Character.isBMPCodePoint(c))
>                v[--n] = (char)c;
>            else
>                Character.toSurrogates(c, v, n-=2);
>        }

Do you have actual evidence that this is faster?

I can see a different reason why - βουστροφηδόν traversal
is more cache-friendly.

http://en.wikipedia.org/wiki/Boustrophedon
Maybe those ancient Greeks were on to something.

Martin


From martinrb at google.com  Thu Mar 25 23:55:06 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 16:55:06 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BABCAC0.4010704@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA56749.8020506@gmx.de>
	<1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>
	<4BABCAC0.4010704@gmx.de>
Message-ID: <1ccfd1c11003251655p3588c58cl48c598d044f11d00@mail.gmail.com>

On Thu, Mar 25, 2010 at 13:42, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 24.03.2010 09:24, schrieb Martin Buchholz:
>>
>> Ulf, Sherman, Masayoshi,
>> here are changes for you to review.
>> Only the patch highSurrogate needs a separate bug filed
>> (and CCC, please)
>>
>
> I had just filed it 2 weeks ago, see:
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6933322

Thank you very much.  Webrev adjusted.


From Ulf.Zibis at gmx.de  Thu Mar 25 23:55:57 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 26 Mar 2010 00:55:57 +0100
Subject: Character.ulf-opto
In-Reply-To: <4BAB9B0A.7030207@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>		<4BA56749.8020506@gmx.de>	<1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>
	<4BAB9B0A.7030207@gmx.de>
Message-ID: <4BABF80D.7050105@gmx.de>

Am 25.03.2010 18:19, schrieb Ulf Zibis:
> Am 24.03.2010 09:24, schrieb Martin Buchholz:
>
>> Very minor optimizations.  Barely worth doing.
>> Note my removal of the need to have n++ inside the loop.
>
> Overseen. Shame on me, as that's true Ulf-style. Yes, reduces 
> in/decrements on rare supplementary cases.
>
>> imported patch ulf-opto
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ulf-opto

You didn't add my throws comments to offsetByCodePointsImpl and 
codePointCountImpl. Why?

-Ulf


From martinrb at google.com  Thu Mar 25 23:59:05 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 16:59:05 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BAB8CE4.8020804@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003240032y77a6b77fi73b39ea698673860@mail.gmail.com>
	<4BAB8CE4.8020804@gmx.de>
Message-ID: <1ccfd1c11003251659y48b1f0efk2de9271bedc991bc@mail.gmail.com>

On Thu, Mar 25, 2010 at 09:18, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 24.03.2010 08:32, schrieb Martin Buchholz:
>>
>> Hi Ulf,
>>
>> You have this interesting optimization:
>>
>>      public static boolean isSurrogate(char ch) {
>> -        return ch>= MIN_SURROGATE&&  ch<  MAX_SURROGATE + 1;
>> +        return (ch -= MIN_SURROGATE)>= 0&&  ch<  MAX_SURROGATE + 1 -
>> MIN_SURROGATE;
>>      }
>>
>> Do you have any evidence that hotspot can produce better code from this,
>> or that there is a measurable performance improvement?
>> Or was this just an experiment?
>>
>
> If isHighSurrogate and isSurrogate are used consecutively on the same char,
> the result of ch -= MIN_SURROGATE could be used for both.
> If isLowSurrogate and isSurrogate are used consecutively on the same char,
> the result of ch -= MAX_SURROGATE would fit better.
> If isHighSurrogate and isLowSurrogate are used consecutively on the same char,
> the result of ch -= MIN_LOW_SURROGATE would fit better.

It seems to me that you get the same opportunities for constant-folding.

Are you suggesting that there are x86 instructions that could be
more efficient if they have an argument value of MAX_SURROGATE-MIN_SURROGATE
than if they had an argument value of MAX_SURROGATE?
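
For reference, here is a minimal sketch of the two range checks under discussion, written against the public Character.MIN_SURROGATE and Character.MAX_SURROGATE constants. The class and method names are made up for illustration and are not JDK code. The sketch only shows that the two forms agree for every char value; it says nothing about which one HotSpot compiles better, which is the open question here.

```java
// Illustrative only: class and method names are hypothetical, not JDK code.
final class SurrogateRangeCheck {

    // Plain two-sided comparison, as in the line marked "-" in the diff.
    static boolean isSurrogatePlain(char ch) {
        return ch >= Character.MIN_SURROGATE && ch < Character.MAX_SURROGATE + 1;
    }

    // Subtract-then-compare form, as in the line marked "+" in the diff.
    // The compound assignment narrows the result back to char, so values
    // below MIN_SURROGATE wrap around to large positive values and fail
    // the upper-bound test.
    static boolean isSurrogateSubtract(char ch) {
        return (ch -= Character.MIN_SURROGATE) >= 0
                && ch < Character.MAX_SURROGATE + 1 - Character.MIN_SURROGATE;
    }

    // Returns the number of char values on which the two forms disagree.
    static int countMismatches() {
        int mismatches = 0;
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char ch = (char) i;
            if (isSurrogatePlain(ch) != isSurrogateSubtract(ch)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```

Because char is unsigned, the ">= 0" test in the second form can never fail; the whole check rests on the single upper-bound comparison after the wrap-around.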

Martin

> I suggest using 1st pair in JDK library.
>
> -Ulf
>
>
>


From martinrb at google.com  Fri Mar 26 00:06:27 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 17:06:27 -0700
Subject: String.lastIndexOf confused by unpaired trailing surrogate
In-Reply-To: <4BAB9B0A.7030207@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA56749.8020506@gmx.de>
	<1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>
	<4BAB9B0A.7030207@gmx.de>
Message-ID: <1ccfd1c11003251706o247368a2n281077554ae73a05@mail.gmail.com>

On Thu, Mar 25, 2010 at 10:19, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 24.03.2010 09:24, schrieb Martin Buchholz:

>> Addition of highSurrogate and lowSurrogate
>> imported patch highSurrogate
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/highSurrogate
>>
>
> Looks good. Interesting workaround on my "Note:"
> I've reckoned with dropping my highSurrogate(char highCPWord, char
> lowCPWord).

Yeah, it's not the kind of method that tends to become a public API.

If you can demonstrate a real performance advantage for highSurrogate(char,char)
beyond just EUC_TW, esp in UTF_8, then we can put it into Surrogate.java.

Martin

> Anyway, I'd like to note that I use that shortcut in my EUC_TW$Decoder
> twiddling. The following code:
>
>            da[dp] = Character.highSurrogate(0x20000 + c);
> results in (19 bytes):
>  0x00b8ae27: add    $0x20000,%ecx      ;*iadd
>                                        ; - sun.nio.cs.ext.D_21_d_narrow::decode@98 (line 196)
>  0x00b8ae2d: mov    %ecx,%ebp
>  0x00b8ae2f: shr    $0xa,%ebp
>  0x00b8ae32: add    $0xd7c0,%ebp       ;*isub
>                                        ; - java.lang.Character::highSurrogate@9 (line 3343)
>                                        ; - sun.nio.cs.ext.D_21_d_narrow::decode@99 (line 196)
>
>            da[dp] = Character.highSurrogate((char)0x2, c);
> results in (9 bytes):
>  0x00b899e7: shr    $0xa,%ebp
>  0x00b899ea: add    $0xd840,%ebp       ;*isub
>                                        ; - java.lang.Character::highSurrogate@14 (line 3365)
>                                        ; - sun.nio.cs.ext.D_22_d_n_fastSurrogate::decode@97 (line 196)
>
>
>            dst.putInt(Character.highSurrogate((char)0x2, c) << 16 |
>                       Character.lowSurrogate(c));
> would additionally increase performance. I'm still preparing the benchmark +
> disassembly.
>
> This kind of twiddling could be used in all charset coders that process
> surrogates, e.g. it may also hold for UTF_x.
> If public, it would also be useful for developers writing charset coders for
> exotic charsets via java.nio.charset.spi.CharsetProvider
>
> -Ulf
>
>
>
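
To make the quoted shortcut concrete, here is a minimal sketch of what a two-argument highSurrogate could look like, checked against the public Character.highSurrogate(int). The two-argument method is hypothetical: only its signature appears in this thread, and the body below is my assumption, chosen so that the constant for plane 2 comes out as the 0xd840 seen in the disassembly.

```java
// Illustrative only: the two-argument method is a hypothetical helper and
// is not part of java.lang.Character. Its body is an assumption.
final class HighSurrogateShortcut {

    // The code point is supplied as two 16-bit words:
    // highCPWord = codePoint >>> 16, lowCPWord = codePoint & 0xFFFF.
    static char highSurrogate(char highCPWord, char lowCPWord) {
        return (char) (Character.MIN_HIGH_SURROGATE
                - (Character.MIN_SUPPLEMENTARY_CODE_POINT >>> 10)
                + (highCPWord << 6)
                + (lowCPWord >>> 10));
    }

    // Counts the low words for which the shortcut differs from
    // Character.highSurrogate(int) for code points in plane 2.
    static int countMismatchesInPlane2() {
        int mismatches = 0;
        for (int c = 0; c <= 0xFFFF; c++) {
            char viaInt = Character.highSurrogate(0x20000 + c);
            char viaWords = highSurrogate((char) 0x2, (char) c);
            if (viaInt != viaWords) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches in plane 2: " + countMismatchesInPlane2());
    }
}
```

When the high word is a compile-time constant, everything except the shift of the low word folds into one constant, which would explain why the second listing needs only a shr and an add.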


From martinrb at google.com  Fri Mar 26 00:55:15 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 17:55:15 -0700
Subject: Character.ulf-opto
In-Reply-To: <4BABF80D.7050105@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA56749.8020506@gmx.de>
	<1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>
	<4BAB9B0A.7030207@gmx.de> <4BABF80D.7050105@gmx.de>
Message-ID: <1ccfd1c11003251755l493fbfcdi5c7d2195db607b@mail.gmail.com>

On Thu, Mar 25, 2010 at 16:55, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 25.03.2010 18:19, schrieb Ulf Zibis:
>>
>> Am 24.03.2010 09:24, schrieb Martin Buchholz:
>>
>>> Very minor optimizations.  Barely worth doing.
>>> Note my removal of the need to have n++ inside the loop.
>>
>> Overlooked. Shame on me, as that's true Ulf-style. Yes, it reduces
>> in/decrements in the rare supplementary cases.

Actually, it optimizes for BMP characters, doesn't it?

>>> imported patch ulf-opto
>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ulf-opto
>
> You didn't add my throws comments to offsetByCodePointsImpl and
> codePointCountImpl. Why?

codePointCountImpl will never throw the way it is called now, I think.
offsetByCodePointsImpl throws explicitly, so a comment is not worthwhile.
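
As an illustration of the "no n++ inside the loop" point, here is a minimal sketch of a counting loop that starts from the number of chars and only decrements when it sees a well-formed surrogate pair. This is my reading of the idea, not a copy of the webrev; the class name is made up, and the result is compared against the public Character.codePointCount(char[], int, int).

```java
// Illustrative only: a sketch of the counting idea, not the JDK source.
final class CodePointCountSketch {

    // Counts code points in a[beginIndex, endIndex). The counter starts at
    // the number of chars and is decremented once per surrogate pair, so
    // text consisting only of BMP characters never touches the counter.
    static int codePointCount(char[] a, int beginIndex, int endIndex) {
        int n = endIndex - beginIndex;
        for (int i = beginIndex; i < endIndex; ) {
            if (Character.isHighSurrogate(a[i++]) && i < endIndex
                    && Character.isLowSurrogate(a[i])) {
                n--;
                i++;
            }
        }
        return n;
    }

    // Builds the test text 'a', U+1D50A (a supplementary code point), 'b'.
    static char[] sample() {
        return new StringBuilder()
                .append('a').appendCodePoint(0x1D50A).append('b')
                .toString().toCharArray();
    }

    public static void main(String[] args) {
        char[] text = sample();
        System.out.println("sketch: " + codePointCount(text, 0, text.length));
        System.out.println("JDK:    " + Character.codePointCount(text, 0, text.length));
    }
}
```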

Martin


From martinrb at google.com  Fri Mar 26 01:04:57 2010
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 25 Mar 2010 18:04:57 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BAB8095.8030903@gmx.de>
References: <4A95079A.8080803@gmx.de> <4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
	<4BA60A40.9050600@gmx.de>
	<1ccfd1c11003210916rad35d31wb0501f9bf960b07@mail.gmail.com>
	<4BA78011.2070504@gmx.de>
	<1ccfd1c11003221503r46e6bb78g241e2b07ff7f1b3c@mail.gmail.com>
	<4BAB8095.8030903@gmx.de>
Message-ID: <1ccfd1c11003251804m6c651cdapeb1cdf16487fcbc4@mail.gmail.com>

On Thu, Mar 25, 2010 at 08:26, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 22.03.2010 23:03, schrieb Martin Buchholz:
>>
>> On Mon, Mar 22, 2010 at 07:34, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>
>>>
>>> Am 21.03.2010 17:16, schrieb Martin Buchholz:
>>>
>>
>>
>>>>
>>>> There is a debate about whether to reuse existing exception classes
>>>> or to throw class-specific subclasses.  IMO, IOOBE is a sufficiently
>>>> expressive exception that I might have used just that, with expressive
>>>> detail messages.
>>>>
>>>>
>>>
>>> I'm with you. Especially StringIndexOutOfBoundsException appears as
>>> superfluous sugar to me. But we have it in the docs, so there is no way
>>> to get rid of it.
>>> What do you think about refactoring most IOOBEs in String-related classes
>>> to SIOOBEs? It would stay compatible with old software that still catches
>>> IOOBEs, but would look more straightforward, tidy and clean, and would fix
>>> the bug mentioned below.
>>>
>>
>> Every change is an incompatible change, with a risk/benefit tradeoff.
>>
>> IMO there is no change to the exceptions thrown, or declared to be thrown,
>> or to their detail messages, in the string classes that is worth the risk
>> of incompatible change.
>>
>
> That is somewhat reasonable, but what's the win of those "creative" variations
> on exception messages _and_ types in AbstractStringBuilder? :
> throw new StringIndexOutOfBoundsException();
> throw new StringIndexOutOfBoundsException(index);
> throw new StringIndexOutOfBoundsException(start);
> throw new StringIndexOutOfBoundsException("start > length()");
> throw new StringIndexOutOfBoundsException("start > end");
> throw new StringIndexOutOfBoundsException(end - start);
> throw new StringIndexOutOfBoundsException(srcEnd);
> throw new StringIndexOutOfBoundsException("srcBegin > srcEnd");
> throw new IndexOutOfBoundsException();
> throw new IndexOutOfBoundsException("start " + start + ", end " + end + ", s.length() " + s.length());
> throw new IndexOutOfBoundsException("dstOffset "+dstOffset);

It's a sad situation.  It's certain someone is stupid enough
to have written a program that depends on the details above,
and the question is whether the improvement is worthwhile.
A measurable performance improvement will make your case
much stronger.
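
To illustrate the compatibility concern, here is a small probe, assuming only that StringIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException (which it is). The class and helper names are made up. The concrete exception class and detail message it prints may differ between JDK releases, which is exactly the detail a fragile program could depend on; only the IOOBE supertype is something I would rely on.

```java
// Illustrative only: names are hypothetical. The printed class names and
// messages are release-dependent; only the IOOBE supertype is assumed.
final class StringExceptionProbe {

    // Runs the action and returns the RuntimeException it throws, or null.
    static RuntimeException thrownBy(Runnable action) {
        try {
            action.run();
            return null;
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        Runnable[] probes = {
            () -> sb.insert(10, "x"),      // offset beyond length
            () -> sb.substring(5),         // start beyond length
            () -> sb.subSequence(2, 1),    // start greater than end
        };
        for (Runnable probe : probes) {
            RuntimeException e = thrownBy(probe);
            System.out.println(e.getClass().getSimpleName()
                    + " (IOOBE: " + (e instanceof IndexOutOfBoundsException)
                    + "), message: " + e.getMessage());
        }
    }
}
```

A caller that catches IndexOutOfBoundsException keeps working whichever of the two classes is thrown; a caller that catches the subclass, or parses the message, does not.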

>> (with the exception of when the implementation contradicts the spec,
>> which is worth fixing)
>>
>
> #insert(int, char[], int, int) uses System.arraycopy().
> If the capacity doesn't suffice, it would throw an IOOBE, not an SIOOBE.
>
> #insert(int, CharSequence) states:
>     * @throws     IndexOutOfBoundsException  if the offset is invalid.
> but (1) in fact throws SIOOBE in the described case, if the CharSequence is a
> String,
> and (2) additionally throws IOOBE in case of capacity overflow, which is not
> mentioned.
>
> #insert(...) methods mix between (int index, ...) and (int dstIndex, ...)
> without any reason.
>
> #substring(int) could be faster if it did not go through the detailed
> bounds checking of substring(int, int).
>
> #subSequence(int, int) in fact throws SIOOBE instead of IOOBE.
>
> #appendCodePoint(int) could throw AIOOBE; as with many other append
> methods, the capacity overflow behaviour is not documented.
>
> I stop here ... ;-)

Several of us have been here,
wanting to improve these minor blemishes,
and eventually deciding to put our effort elsewhere.

I am in favor of at least fixing the detail messages and making the
argument names more regular, but you'll have to get the support
of others as well.  Sherman?

Martin


From Ulf.Zibis at gmx.de  Fri Mar 26 17:56:06 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 26 Mar 2010 18:56:06 +0100
Subject: Minor improvements to Character.UnicodeBlock
In-Reply-To: <1ccfd1c11003251637u6716d8efq5f8e4846ca07093@mail.gmail.com>
References: <1ccfd1c11003251637u6716d8efq5f8e4846ca07093@mail.gmail.com>
Message-ID: <4BACF536.4040604@gmx.de>

Wow, that's indeed much better than my simple whitespace correction.

Looks good. I guess the doc corrections are still in progress.
My old patch may help to locate them:
https://bugs.openjdk.java.net/attachment.cgi?id=146

-Ulf


Am 26.03.2010 00:37, schrieb Martin Buchholz:
> Hi Masayoshi and Ulf,
>
> I'd like you to do a code review.
>
> There are actual doc bugs in the specification
> of the surrogate unicode blocks.
>
> Ulf convinced me that we should use U+ notation for
> Unicode blocks.
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/UnicodeBlock/
>
> Martin
>
>
>    


From Ulf.Zibis at gmx.de  Fri Mar 26 18:08:05 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 26 Mar 2010 19:08:05 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003251633j3f735662m23dde18b8973bb9@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995D22.2020507@gmx.de>	
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>	
	<4BA8B285.1040403@gmx.de>	
	<1ccfd1c11003231559x25aef975hca5b81e9dfe9b09c@mail.gmail.com>	
	<4BAA49D1.8010702@gmx.de>	
	<1ccfd1c11003241234s5f7c4ec5l9570705d51892567@mail.gmail.com>	
	<4BABC6F2.70604@gmx.de>
	<1ccfd1c11003251633j3f735662m23dde18b8973bb9@mail.gmail.com>
Message-ID: <4BACF805.2030004@gmx.de>

Am 26.03.2010 00:33, schrieb Martin Buchholz:
> On Thu, Mar 25, 2010 at 13:26, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>
>> - several UnicodeBlock declarations differ slightly in indentation/whitespace
>> usage from the average. I would prefer:
>>         public static final UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_A =
>>             new UnicodeBlock("SUPPLEMENTARY_PRIVATE_USE_AREA_A",
>>                              new String[] { "Supplementary Private Use Area-A",
>>                                             "SupplementaryPrivateUseArea-A" });
>>      
> See forthcoming patch.
>
>
>    

I've forgotten to mention: 
http://java.sun.com/docs/codeconv/html/CodeConventions.doc2.html#1852
Applies to location of static final int SIZE = 16;

-Ulf


From Ulf.Zibis at gmx.de  Fri Mar 26 18:36:20 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Fri, 26 Mar 2010 19:36:20 +0100
Subject: Purge Surrogate usages
In-Reply-To: <1ccfd1c11003231650g48dc0fb8gd445c46699433377@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>	
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>	
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003231650g48dc0fb8gd445c46699433377@mail.gmail.com>
Message-ID: <4BACFEA4.2040502@gmx.de>

Am 24.03.2010 00:50, schrieb Martin Buchholz:
> I've added another mini-patch to my patch set.
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint3
>
> This deletes Surrogate.java, as Ulf wants,
> except that ... it's another variant of Surrogate.java!
> (which I didn't know existed)
>
> Uses of Surrogate.neededFor are all now changed to
> Character.isSupplementaryCodePoint, as suggested by Ulf.
>
> I intend to fold all of the isBMPCodePoint patches together into one
> before I commit them.
>
> Ulf, please review.
>    

Looking at my old patch:
https://bugs.openjdk.java.net/attachment.cgi?id=148&action=diff,
I'm afraid, that there are some remaining references to Surrogate class 
in the code base:
- cold imports
- static final constants
Can you remove them in your patch?

-Ulf


From martinrb at google.com  Fri Mar 26 23:18:13 2010
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 26 Mar 2010 16:18:13 -0700
Subject: Purge Surrogate usages
In-Reply-To: <4BACFEA4.2040502@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003231650g48dc0fb8gd445c46699433377@mail.gmail.com>
	<4BACFEA4.2040502@gmx.de>
Message-ID: <1ccfd1c11003261618u20b631d3x5bc08de22d82a9f2@mail.gmail.com>

On Fri, Mar 26, 2010 at 11:36, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 24.03.2010 00:50, schrieb Martin Buchholz:
>>
>> I've added another mini-patch to my patch set.
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint3
>>
>> This deletes Surrogate.java, as Ulf wants,
>> except that ... it's another variant of Surrogate.java!
>> (which I didn't know existed)
>>
>> Uses of Surrogate.neededFor are all now changed to
>> Character.isSupplementaryCodePoint, as suggested by Ulf.
>>
>> I intend to fold all of the isBMPCodePoint patches together into one
>> before I commit them.
>>
>> Ulf, please review.
>>
>
> Looking at my old patch:
> https://bugs.openjdk.java.net/attachment.cgi?id=148&action=diff,
> I'm afraid, that there are some remaining references to Surrogate class in
> the code base:
> - cold imports
> - static final constants
> Can you remove them in your patch?


OK, just to make you happy, more Surrogate cleansing.
Two more mini-patches for you to review:

To be qfolded into public-isBMPCodePoint
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint4

to be qfolded into highSurrogate
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/highSurrogate2

Martin


From kelly.ohair at sun.com  Sat Mar 27 05:44:16 2010
From: kelly.ohair at sun.com (kelly.ohair at sun.com)
Date: Sat, 27 Mar 2010 05:44:16 +0000
Subject: hg: jdk7/tl/langtools: 6938326: Use of "ant -diagnostics" a problem
	with ant 1.8.0, exit code 1 now
Message-ID: <20100327054418.366324454D@hg.openjdk.java.net>

Changeset: de6375751eb7
Author:    ohair
Date:      2010-03-26 22:37 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/langtools/rev/de6375751eb7

6938326: Use of "ant -diagnostics" a problem with ant 1.8.0, exit code 1 now
Reviewed-by: jjg

! make/Makefile


From Ulf.Zibis at gmx.de  Sat Mar 27 09:48:47 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sat, 27 Mar 2010 10:48:47 +0100
Subject: Purge Surrogate usages
In-Reply-To: <1ccfd1c11003261618u20b631d3x5bc08de22d82a9f2@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>	
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>	
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>	
	<1ccfd1c11003231650g48dc0fb8gd445c46699433377@mail.gmail.com>	
	<4BACFEA4.2040502@gmx.de>
	<1ccfd1c11003261618u20b631d3x5bc08de22d82a9f2@mail.gmail.com>
Message-ID: <4BADD47F.7090502@gmx.de>

Am 27.03.2010 00:18, schrieb Martin Buchholz:
> On Fri, Mar 26, 2010 at 11:36, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>>
>> Looking at my old patch:
>> https://bugs.openjdk.java.net/attachment.cgi?id=148&action=diff,
>> I'm afraid, that there are some remaining references to Surrogate class in
>> the code base:
>> - cold imports
>> - static final constants
>> Can you remove them in your patch?
>>      
>
> OK, just to make you happy, more Surrogate cleansing.
> Two more mini-patches for you to review:
>
> To be qfolded into public-isBMPCodePoint
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isBMPCodePoint4
>    

Looks good.
Maybe you could rename isBMPCodePoint* to public-isBMPCodePoint*.
I often get confused, because isBMPCodePoint looks like the predecessor of
isBMPCodePoint*.

> to be qfolded into highSurrogate
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/highSurrogate2
>    

You additionally could add:

      * Use of {@link Character#high/lowSurrogate} is generally preferred.

and propagate those methods to Character class.


-Ulf


From martinrb at google.com  Sat Mar 27 15:51:45 2010
From: martinrb at google.com (Martin Buchholz)
Date: Sat, 27 Mar 2010 08:51:45 -0700
Subject: Purge Surrogate usages
In-Reply-To: <4BADD47F.7090502@gmx.de>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>
	<1ccfd1c11003231650g48dc0fb8gd445c46699433377@mail.gmail.com>
	<4BACFEA4.2040502@gmx.de>
	<1ccfd1c11003261618u20b631d3x5bc08de22d82a9f2@mail.gmail.com>
	<4BADD47F.7090502@gmx.de>
Message-ID: <1ccfd1c11003270851k24dad2b6h73ce528d84348999@mail.gmail.com>

On Sat, Mar 27, 2010 at 02:48, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 27.03.2010 00:18, schrieb Martin Buchholz:

> You additionally could add:
>
>      * Use of {@link Character#high/lowSurrogate} is generally preferred.
>
> and propagate those methods to Character class.

Thanks.  Done.

(I think "delegate" expresses your intent better than "propagate")

Martin


From Ulf.Zibis at gmx.de  Sat Mar 27 21:08:07 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sat, 27 Mar 2010 22:08:07 +0100
Subject: review request for 6798511/6860431: Include functionality of
	Surrogate in Character
In-Reply-To: <1ccfd1c11003221527q29f61f7u700344a99d293ceb@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B9FE4DD.1090405@sun.com>	
	<1ccfd1c11003161409u923d21ya30acd8b104ee9ac@mail.gmail.com>	
	<4BA007A4.2030907@sun.com> <4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA543A0.2060600@gmx.de>	
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>	
	<1ccfd1c11003211239h2105e5f1m903dd5d3fbf5387b@mail.gmail.com>	
	<4BA78CE8.9020107@gmx.de>
	<1ccfd1c11003221527q29f61f7u700344a99d293ceb@mail.gmail.com>
Message-ID: <4BAE73B7.40101@gmx.de>

Am 22.03.2010 23:27, schrieb Martin Buchholz:
> Ulf,
>
> I'd like to start a mq patch containing changes to
> the String exception handling in the string classes.
> Please provide me with a patch that uses the
> blessed conventional names from Preconditions.java.
>    

Here are my first patches to start with.
In the 2nd patch I made additional speed-ups, corrections and renamings.

Please review.

-Ulf


> For the version that checks an offset and length for
> containment within a larger sequence, I would prefer
> the name "checkSubsequence", for example
>
> private static void checkSubsequence(int start, int len, int size)
>
> Please make sure that there are sufficient tests in
> test/java/lang/String to ensure that you are not
> inadvertently making changes to the exceptions thrown.
>
> I note that test/java/lang/String/{Exceptions,Supplementary}
> do try to test exception handling, but do not appear to
> test for the *exact* class of the exception thrown,
> nor the detail message of the exception.
> When those tests were written, compatibility was less important.
>
> Please adapt my
> test/java/util/ArrayList/RangeCheckMicroBenchmark.java
> to test string classes instead.
> There is a good chance that you can demonstrate
> a performance improvement on ordinary String operations!
>
> Thanks,
>
> Martin
>
>
>    
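
For readers without the attachments, here is a minimal sketch of what a checkSubsequence helper with the signature quoted above could look like. Only the signature comes from this thread; the body, the detail message and the class name are my assumptions, and the sketch assumes size is never negative.

```java
// Illustrative only: only the checkSubsequence signature comes from the
// thread. Body, message format and class name are assumptions.
final class PreconditionsSketch {

    // Checks that [start, start + len) lies within [0, size).
    // Assumes size >= 0. The comparison is written as start > size - len
    // so that start + len cannot overflow.
    static void checkSubsequence(int start, int len, int size) {
        if ((start | len) < 0 || start > size - len) {
            throw new IndexOutOfBoundsException(
                    "start " + start + ", len " + len + ", size " + size);
        }
    }

    // Convenience wrapper for experiments: true if the check passes.
    static boolean isValid(int start, int len, int size) {
        try {
            checkSubsequence(start, len, size);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(0, 3, 3));                  // whole sequence
        System.out.println(isValid(1, 3, 3));                  // runs past the end
        System.out.println(isValid(2, Integer.MAX_VALUE, 10)); // would overflow if added
    }
}
```

The (start | len) < 0 test folds the two sign checks into one branch, which is the kind of small saving the suggested micro-benchmark would have to justify.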
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: String_Preconditions
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100327/97f5bee7/String_Preconditions.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: String_Preconditions2
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100327/97f5bee7/String_Preconditions2.ksh>

From Ulf.Zibis at gmx.de  Sat Mar 27 22:15:44 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sat, 27 Mar 2010 23:15:44 +0100
Subject: Sponsor for 6666666: A better implementation of
	Character.isSupplementaryCodePoint
In-Reply-To: <1ccfd1c11003251633j3f735662m23dde18b8973bb9@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4B99373E.40502@gmx.de>	
	<1ccfd1c11003111138n3c666e91q60079121176ddd@mail.gmail.com>	
	<4B995D22.2020507@gmx.de>	
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>	
	<4BA8B285.1040403@gmx.de>	
	<1ccfd1c11003231559x25aef975hca5b81e9dfe9b09c@mail.gmail.com>	
	<4BAA49D1.8010702@gmx.de>	
	<1ccfd1c11003241234s5f7c4ec5l9570705d51892567@mail.gmail.com>	
	<4BABC6F2.70604@gmx.de>
	<1ccfd1c11003251633j3f735662m23dde18b8973bb9@mail.gmail.com>
Message-ID: <4BAE838F.1040301@gmx.de>

Am 26.03.2010 00:33, schrieb Martin Buchholz:
> On Thu, Mar 25, 2010 at 13:26, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am 24.03.2010 20:34, schrieb Martin Buchholz:
>>      
>>> On Wed, Mar 24, 2010 at 10:20, Ulf Zibis<Ulf.Zibis at gmx.de>    wrote:
>>>
>>>        
>>>> I too would like to see 8 spaces indentation on line breaks like:
>>>>     if (aaaaaaaaaaaaaaa > bbbbbbbbbbbbb &&
>>>>             ccccccccccccccc > ddddddddddddddddd)
>>>>         doSomething();
>>>>
>>>>          
>>> This appears to be a new style (perhaps coming from the java IDEs?)
>>>
>>>        
>> This rule is much older:
>> http://java.sun.com/docs/codeconv/html/CodeConventions.doc3.html#248
>> But yes, I first saw this from NetBeans IDE formatting facility.
>>      
> Ahhh, thank you very much for this history lesson.
>
> I have manually adjusted some source files as you requested,
> but systematically fixing this particular coding style bug
> is likely to be difficult.
>    

NetBeans IDE does a good job on that. The other formatting tasks
may also be well addressed there.

-Ulf


From Ulf.Zibis at gmx.de  Sat Mar 27 22:37:57 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sat, 27 Mar 2010 23:37:57 +0100
Subject: Character.ulf-opto
In-Reply-To: <1ccfd1c11003251755l493fbfcdi5c7d2195db607b@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA56749.8020506@gmx.de>	
	<1ccfd1c11003240124v24db88c0wd8c05396a92a6fef@mail.gmail.com>	
	<4BAB9B0A.7030207@gmx.de> <4BABF80D.7050105@gmx.de>
	<1ccfd1c11003251755l493fbfcdi5c7d2195db607b@mail.gmail.com>
Message-ID: <4BAE88C5.2040205@gmx.de>

Am 26.03.2010 01:55, schrieb Martin Buchholz:
> On Thu, Mar 25, 2010 at 16:55, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> Am 25.03.2010 18:19, schrieb Ulf Zibis:
>>      
>>> Am 24.03.2010 09:24, schrieb Martin Buchholz:
>>>
>>>        
>>>> Very minor optimizations.  Barely worth doing.
>>>> Note my removal of the need to have n++ inside the loop.
>>>>          
>>> Overlooked. Shame on me, as that's true Ulf-style. Yes, it reduces
>>> in/decrements in the rare supplementary cases.
>>>        
> Actually, it optimizes for BMP characters, doesn't it?
>    

Yes, of course.

>    
>>>> imported patch ulf-opto
>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ulf-opto
>>>>          
>> You didn't add my throws comments to offsetByCodePointsImpl and
>> codePointCountImpl. Why?
>>      
> codePointCountImpl will never throw the way it is called now, I think.
>    

Seems, you are right.

> offsetByCodePointsImpl throws explicitly, so a comment is not worthwhile.
>    

Well, those comments were valuable in my research into possible
exception doc bugs.

-Ulf


From Ulf.Zibis at gmx.de  Sat Mar 27 22:43:27 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sat, 27 Mar 2010 23:43:27 +0100
Subject: Purge Surrogate usages
In-Reply-To: <1ccfd1c11003270851k24dad2b6h73ce528d84348999@mail.gmail.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>	
	<4BA9256A.2020602@sun.com> <4BA92673.3030200@sun.com>	
	<1ccfd1c11003221536k7fa58a39jbbf4dfcfa34b01c0@mail.gmail.com>	
	<4BA8E844.7080901@gmx.de> <4BA90188.3090902@gmx.de>	
	<1ccfd1c11003231650g48dc0fb8gd445c46699433377@mail.gmail.com>	
	<4BACFEA4.2040502@gmx.de>	
	<1ccfd1c11003261618u20b631d3x5bc08de22d82a9f2@mail.gmail.com>	
	<4BADD47F.7090502@gmx.de>
	<1ccfd1c11003270851k24dad2b6h73ce528d84348999@mail.gmail.com>
Message-ID: <4BAE8A0F.8090602@gmx.de>

Am 27.03.2010 16:51, schrieb Martin Buchholz:
> On Sat, Mar 27, 2010 at 02:48, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>
>> You additionally could add:
>>
>>      * Use of {@link Character#high/lowSurrogate} is generally preferred.
>>
>> and propagate those methods to Character class.
>>      
> Thanks.  Done.
>
> (I think "delegate" expresses your intent better than "propagate")

Thanks for your help in wording.

-Ulf


From kevin.l.stern at gmail.com  Sun Mar 28 11:55:12 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Sun, 28 Mar 2010 06:55:12 -0500
Subject: A List implementation backed by multiple small arrays rather than the
	traditional single large array.
Message-ID: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>

I put together the following class, ChunkedArrayList, in response to
Martin's request (excerpted from an earlier conversation on this web board)
below.

https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50

Thoughts?

Regards,

Kevin


On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
wrote:

    It surely is not a good idea to use a single backing array
    for huge arrays.  As you point out, it's up to 32GB
    for just one object.  But the core JDK
    doesn't offer a suitable alternative for users who need very
    large collections.

    It would have been more in the spirit of Java to have a
    collection class instead of ArrayList that was not fastest at
    any particular operation, but had excellent asymptotic behaviour,
    based on backing arrays containing backing arrays.
    But:
    - no such excellent class has been written yet
     (or please point me to such a class)
    - even if it were, such a best-of-breed-general-purpose
     List implementation would probably need to be introduced as a
     separate class, because of the performance expectations of
     existing implementations.

    In the meantime, we have to maintain what we got,
    and that includes living with arrays and classes that wrap them.

    Changing the spec is unlikely to succeed..

    Martin
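
To make "backing arrays containing backing arrays" concrete, here is a deliberately small sketch of the general idea. It is not the ChunkedArrayList behind the link above, whose code is not reproduced in this message; the class name, the chunk size of 1024 and the append-only design are all my assumptions. It supports only get, set and append at the end.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical append-only list stored in fixed-size
// chunks. Not the class discussed in this thread.
final class ChunkedListSketch<E> extends AbstractList<E> {
    private static final int SHIFT = 10;               // 1024 elements per chunk
    private static final int CHUNK_SIZE = 1 << SHIFT;
    private static final int MASK = CHUNK_SIZE - 1;

    // Directory of chunks. It is still one array underneath, but it is
    // CHUNK_SIZE times smaller than a single flat backing array would be.
    private final List<Object[]> chunks = new ArrayList<>();
    private int size;

    @Override
    public int size() {
        return size;
    }

    @Override
    @SuppressWarnings("unchecked")
    public E get(int index) {
        requireInRange(index);
        return (E) chunks.get(index >>> SHIFT)[index & MASK];
    }

    @Override
    @SuppressWarnings("unchecked")
    public E set(int index, E element) {
        requireInRange(index);
        Object[] chunk = chunks.get(index >>> SHIFT);
        E old = (E) chunk[index & MASK];
        chunk[index & MASK] = element;
        return old;
    }

    // Appends at the end; a new chunk is allocated only when the last one
    // is full, so existing elements are never copied.
    @Override
    public boolean add(E element) {
        if ((size >>> SHIFT) == chunks.size()) {
            chunks.add(new Object[CHUNK_SIZE]);
        }
        chunks.get(size >>> SHIFT)[size & MASK] = element;
        size++;
        modCount++;
        return true;
    }

    private void requireInRange(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
    }

    public static void main(String[] args) {
        ChunkedListSketch<Integer> list = new ChunkedListSketch<>();
        for (int i = 0; i < 5000; i++) {
            list.add(i);
        }
        System.out.println(list.size() + " elements, last = " + list.get(4999));
    }
}
```

Growing never copies the elements already stored, which is the main attraction for very large collections. The price is an extra indirection on every get, and this sketch offers no insertion or removal in the middle or at the front.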

From kevin.l.stern at gmail.com  Sun Mar 28 12:28:22 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Sun, 28 Mar 2010 07:28:22 -0500
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
Message-ID: <1704b7a21003280528s64fe6f32hef68b45865bd3223@mail.gmail.com>

Please ignore the lack of custom serialization, I'll certainly tidy up the
code if there is interest in it.

On Sun, Mar 28, 2010 at 6:55 AM, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:

> I put together the following class, ChunkedArrayList, in response to
> Martin's request (excerpted from an earlier conversation on this web board)
> below.
>
>
> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
>
> Thoughts?
>
> Regards,
>
> Kevin
>
>
> On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
> wrote:
>
>     It surely is not a good idea to use a single backing array
>     for huge arrays.  As you point out, it's up to 32GB
>     for just one object.  But the core JDK
>     doesn't offer a suitable alternative for users who need very
>     large collections.
>
>     It would have been more in the spirit of Java to have a
>     collection class instead of ArrayList that was not fastest at
>     any particular operation, but had excellent asymptotic behaviour,
>     based on backing arrays containing backing arrays.
>     But:
>     - no such excellent class has been written yet
>      (or please point me to such a class)
>     - even if it were, such a best-of-breed-general-purpose
>      List implementation would probably need to be introduced as a
>      separate class, because of the performance expectations of
>      existing implementations.
>
>     In the meantime, we have to maintain what we got,
>     and that includes living with arrays and classes that wrap them.
>
>     Changing the spec is unlikely to succeed..
>
>     Martin
>

From kevin.l.stern at gmail.com  Sun Mar 28 14:19:02 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Sun, 28 Mar 2010 09:19:02 -0500
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1704b7a21003280528s64fe6f32hef68b45865bd3223@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
	<1704b7a21003280528s64fe6f32hef68b45865bd3223@mail.gmail.com>
Message-ID: <1704b7a21003280719h60a570b9te5b1296516f2ea29@mail.gmail.com>

Apologies, please use this link instead; this way you do not need to
download the file (it displays as a document).

https://docs.google.com/Doc?docid=0Aabrz3MPBDdhZGdrbnEzejdfM2M3am5wM2Mz&hl=en

Regards,

Kevin

On Sun, Mar 28, 2010 at 7:28 AM, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:

> Please ignore the lack of custom serialization, I'll certainly tidy up the
> code if there is interest in it.
>
>
> On Sun, Mar 28, 2010 at 6:55 AM, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>
>> I put together the following class, ChunkedArrayList, in response to
>> Martin's request (excerpted from an earlier conversation on this web board)
>> below.
>>
>>
>> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
>>
>> Thoughts?
>>
>> Regards,
>>
>> Kevin
>>
>>
>> On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
>> wrote:
>>
>>     It surely is not a good idea to use a single backing array
>>     for huge arrays.  As you point out, it's up to 32GB
>>     for just one object.  But the core JDK
>>     doesn't offer a suitable alternative for users who need very
>>     large collections.
>>
>>     It would have been more in the spirit of Java to have a
>>     collection class instead of ArrayList that was not fastest at
>>     any particular operation, but had excellent asymptotic behaviour,
>>     based on backing arrays containing backing arrays.
>>     But:
>>     - no such excellent class has been written yet
>>      (or please point me to such a class)
>>     - even if it were, such a best-of-breed-general-purpose
>>      List implementation would probably need to be introduced as a
>>      separate class, because of the performance expectations of
>>      existing implementations.
>>
>>     In the meantime, we have to maintain what we got,
>>     and that includes living with arrays and classes that wrap them.
>>
>>     Changing the spec is unlikely to succeed..
>>
>>     Martin
>>
>
>

From xuelei.fan at sun.com  Mon Mar 29 05:51:14 2010
From: xuelei.fan at sun.com (xuelei.fan at sun.com)
Date: Mon, 29 Mar 2010 05:51:14 +0000
Subject: hg: jdk7/tl/jdk: 6693917: regression tests need to update for
	supporting ECC on solaris 11
Message-ID: <20100329055200.65A9D44814@hg.openjdk.java.net>

Changeset: 31517a0345d1
Author:    xuelei
Date:      2010-03-29 13:27 +0800
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/31517a0345d1

6693917: regression tests need to update for supporting ECC on solaris 11
Reviewed-by: weijun

! test/sun/security/ssl/etc/keystore
! test/sun/security/ssl/etc/truststore
! test/sun/security/ssl/sanity/ciphersuites/CheckCipherSuites.java


From martinrb at google.com  Mon Mar 29 07:23:35 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 29 Mar 2010 00:23:35 -0700
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
Message-ID: <1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>

On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> I put together the following class, ChunkedArrayList, in response to
> Martin's request (excerpted from an earlier conversation on this web board)
> below.
>
> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
>
> Thoughts?

This class is well on the way to what I was thinking of,
but my bar for acceptance is a little higher.
In particular, I don't want to add yet another class
that can replace some, but not all, of the existing
list implementations.

Most obviously, I don't want to lose the ability,
introduced in ArrayDeque, of having O(1) insertion
at the front and end of the collection.
Perhaps you can do this by having one "arraylet"
always be shared by both ends, which
grow towards each other in circular fashion.

I also think we should shrink the array when
necessary, so that occupancy never drops
below, say 50%.

Perhaps we should also have amortized O(1)
insertion in the middle by using a "gap array".
Probably more important for byte/char collections
like StringBuilder...
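
For readers unfamiliar with the term: a gap array (gap buffer) keeps its
unused slots at the current edit position instead of at the end, so repeated
insertions near the same spot need no shifting. The following is only a
minimal sketch of that idea for chars; the class name GapBuffer and all of its
methods are hypothetical and not part of any JDK class. Insertion is amortized
O(1) only while edits stay close together; moving the gap costs time
proportional to the distance moved.

```java
// Hypothetical gap buffer; names and initial capacity are illustrative only.
final class GapBuffer {
    private char[] buf = new char[16];
    private int gapStart = 0;           // first unused slot
    private int gapEnd = buf.length;    // one past the last unused slot

    int length() {
        return buf.length - (gapEnd - gapStart);
    }

    char charAt(int index) {
        if (index < 0 || index >= length())
            throw new IndexOutOfBoundsException("index: " + index);
        // Elements after the gap are shifted right by the gap width.
        return index < gapStart ? buf[index] : buf[index + (gapEnd - gapStart)];
    }

    void insert(int index, char c) {
        if (index < 0 || index > length())
            throw new IndexOutOfBoundsException("index: " + index);
        if (gapStart == gapEnd)
            grow();
        moveGap(index);
        buf[gapStart++] = c;            // O(1) once the gap is in place
    }

    // Slide the gap so that it starts at the given logical index.
    private void moveGap(int index) {
        if (index < gapStart) {
            int n = gapStart - index;
            System.arraycopy(buf, index, buf, gapEnd - n, n);
            gapStart -= n;
            gapEnd -= n;
        } else if (index > gapStart) {
            int n = index - gapStart;
            System.arraycopy(buf, gapEnd, buf, gapStart, n);
            gapStart += n;
            gapEnd += n;
        }
    }

    // Double the capacity, keeping the gap at the same logical position.
    private void grow() {
        char[] bigger = new char[buf.length * 2];
        int tail = buf.length - gapEnd;
        System.arraycopy(buf, 0, bigger, 0, gapStart);
        System.arraycopy(buf, gapEnd, bigger, bigger.length - tail, tail);
        gapEnd = bigger.length - tail;
        buf = bigger;
    }

    @Override
    public String toString() {
        return new String(buf, 0, gapStart)
             + new String(buf, gapEnd, buf.length - gapEnd);
    }
}
```

Note that this sketch uses one contiguous array; combining a gap with
arraylets is a separate question that the sketch does not address.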

I believe there are more complicated implementations
that permit O(1) insertions at the ends, and only
O(sqrt(N)) space overhead.

....

E.g. Use your favorite search engine to do
some research on:
Resizable arrays in optimal time and space
Succinct dynamic data structures

Meta-comment: there is not enough transfer of
academic research results into practice; I would think this
is one of the responsibilities of the researchers.

I presume you'd be willing to sign a
contributor agreement to get your changes into
the JDK someday.

Martin

> Regards,
>
> Kevin
>
>
> On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
> wrote:
>
>     It surely is not a good idea to use a single backing array
>     for huge arrays.  As you point out, it's up to 32GB
>     for just one object.  But the core JDK
>     doesn't offer a suitable alternative for users who need very
>     large collections.
>
>     It would have been more in the spirit of Java to have a
>     collection class instead of ArrayList that was not fastest at
>     any particular operation, but had excellent asymptotic behaviour,
>     based on backing arrays containing backing arrays.
>     But:
>     - no such excellent class has been written yet
>      (or please point me to such a class)
>     - even if it were, such a best-of-breed-general-purpose
>      List implementation would probably need to be introduced as a
>      separate class, because of the performance expectations of
>      existing implementations.
>
>     In the meantime, we have to maintain what we got,
>     and that includes living with arrays and classes that wrap them.
>
>     Changing the spec is unlikely to succeed..
>
>     Martin
>


From opinali at gmail.com  Mon Mar 29 15:08:36 2010
From: opinali at gmail.com (Osvaldo Doederlein)
Date: Mon, 29 Mar 2010 12:08:36 -0300
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
	<1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
Message-ID: <fb5ec5091003290808g4b1c809cr47625316cd1c54cf@mail.gmail.com>

Initially, it would be good enough to replace only java.util.ArrayList with
minimal overhead. ArrayList does not support efficient add-at-front or other
enhancements of ArrayDeque; but ArrayList is still a much more important and
popular collection, it's the primary "straight replacement for primitive
arrays" and I guess it should continue with that role.

One problem of both ArrayList and primitive arrays is that they're not
GC-friendly; huge arrays suck for GC. IBM's realtime Metronome collector
uses the "arraylet" structure for primitive arrays, so there is a hard
upper-limit on object size (well, at least as long as apps don't define
classes with thousands of fields, I guess). This avoids the whole issue of
"large objects" which permits a simpler heap layout, better incremental GC,
etc. There are two tradeoffs. First, some overhead for all array operations
- but this is the least important, especially as the arraylet trick is
implemented at the VM level so we can rely on the JIT to perform extra
optimizations (e.g., unrolling and other loop optimizations; bounds-check
elimination and other array opts may be arraylet-aware, so most overhead is
cancelled or at least lifted out of loop bodies and hot paths.) Second, no
support at all for huge arrays is incompatible with native code that expects
a contiguous layout, e.g. for the byte[]s inside Images - so all these uses
must be identified and fixed somehow, e.g. using DirectBuffers, or changing
the native layer to understand arraylets (image libraries may be OK with
banding), or in the worst case just copy the data to/from a contiguous,
native array (in most cases I think this copy already happens for other
reasons, so there's no extra copy, just a slightly more expensive copy).

Now we're talking about some big VM change of course, but HotSpot would not
be the first production VM to do this so maybe it's a viable project for the
future, especially as Sun plans to keep raising the bar in
incremental/realtime GC (G1 may already be a great step forward, but huge
arrays will always spoil the fun for many apps).

In summary I think the ChunkedArrayList would serve only as a stopgap
solution, with extremely limited benefits unless it's good enough that,
as Martin says, we can replace more List implementations. And I'll even
add, replace many other collections too - e.g. a giant HashMap will contain
a giant Entry[] array inside it, I want this array to be chunked too
(ConcurrentHashMap already is, but it's tuned up differently, for concurrent
usage - and that's just one example anyway). And by "replace" I further mean
"change the implementation of all existing collections that are
array-backed", not "offer new collections" as the latter will only be
heavily used ten years from today when JavaSE7 is considered the minimum
JavaSE release to be supported by apps/libraries/frameworks/containers/etc.
Even then, the benefits will be clearly inferior to what can be achieved by
VM-level arraylets.

A+
Osvaldo


2010/3/29 Martin Buchholz <martinrb at google.com>

> On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
> > I put together the following class, ChunkedArrayList, in response to
> > Martin's request (excerpted from an earlier conversation on this web
> board)
> > below.
> >
> >
> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
> >
> > Thoughts?
>
> This class is well on the way to what I was thinking of,
> but my bar for acceptance is a little higher.
> In particular, I don't want to add yet another class
> that can replace some, but not all, of the existing
> list implementations.
>
> Most obviously, I don't want to lose the ability,
> introduced in ArrayDeque, of having O(1) insertion
> at the front and end of the collection.
> Perhaps you can do this by having one "arraylet"
> always be shared by both ends, which
> grow towards each other in circular fashion.
>
> I also think we should shrink the array when
> necessary, so that occupancy never drops
> below, say 50%.
>
> Perhaps we should also have amortized O(1)
> insertion in the middle by using a "gap array".
> Probably more important for byte/char collections
> like StringBuilder...
>
> I believe there are more complicated implementations
> that permit O(1) insertions at the ends, and only
> O(sqrt(N)) space overhead.
>
> ....
>
> E.g. Use your favorite search engine to do
> some research on:
> Resizable arrays in optimal time and space
> Succinct dynamic data structures
>
> Meta-comment: there is not enough transfer of
> academic research results into practice; I would think this
> is one of the responsibilities of the researchers.
>
> I presume you'd be willing to sign a
> contributor agreement to get your changes into
> the JDK someday.
>
> Martin
>
> > Regards,
> >
> > Kevin
> >
> >
> > On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
> > wrote:
> >
> >     It surely is not a good idea to use a single backing array
> >     for huge arrays.  As you point out, it's up to 32GB
> >     for just one object.  But the core JDK
> >     doesn't offer a suitable alternative for users who need very
> >     large collections.
> >
> >     It would have been more in the spirit of Java to have a
> >     collection class instead of ArrayList that was not fastest at
> >     any particular operation, but had excellent asymptotic behaviour,
> >     based on backing arrays containing backing arrays.
> >     But:
> >     - no such excellent class has been written yet
> >      (or please point me to such a class)
> >     - even if it were, such a best-of-breed-general-purpose
> >      List implementation would probably need to be introduced as a
> >      separate class, because of the performance expectations of
> >      existing implementations.
> >
> >     In the meantime, we have to maintain what we got,
> >     and that includes living with arrays and classes that wrap them.
> >
> >     Changing the spec is unlikely to succeed..
> >
> >     Martin
> >
>

From martinrb at google.com  Mon Mar 29 21:46:00 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 29 Mar 2010 14:46:00 -0700
Subject: Sponsor for 6666666: A better implementation of 
	Character.isSupplementaryCodePoint
In-Reply-To: <4BAE838F.1040301@gmx.de>
References: <4A95079A.8080803@gmx.de> <4B995D22.2020507@gmx.de>
	<1ccfd1c11003121504u5761c160t45513c98d3cec816@mail.gmail.com>
	<4BA8B285.1040403@gmx.de>
	<1ccfd1c11003231559x25aef975hca5b81e9dfe9b09c@mail.gmail.com>
	<4BAA49D1.8010702@gmx.de>
	<1ccfd1c11003241234s5f7c4ec5l9570705d51892567@mail.gmail.com>
	<4BABC6F2.70604@gmx.de>
	<1ccfd1c11003251633j3f735662m23dde18b8973bb9@mail.gmail.com>
	<4BAE838F.1040301@gmx.de>
Message-ID: <1ccfd1c11003291446j20a03c5at3becedb35dff707f@mail.gmail.com>

On Sat, Mar 27, 2010 at 15:15, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 26.03.2010 00:33, schrieb Martin Buchholz:
>>
>> On Thu, Mar 25, 2010 at 13:26, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>
>>>
>>> Am 24.03.2010 20:34, schrieb Martin Buchholz:
>>>
>>>>
>>>> On Wed, Mar 24, 2010 at 10:20, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>>
>>>>
>>>>>
>>>>> I too would like to see 8 spaces indentation on line breaks like:
>>>>>     if (aaaaaaaaaaaaaaa > bbbbbbbbbbbbb &&
>>>>>             ccccccccccccccc > ddddddddddddddddd)
>>>>>         doSomething();
>>>>>
>>>>>
>>>>
>>>> This appears to be a new style (perhaps coming from the java IDEs?)
>>>>
>>>>
>>>
>>> This rule is much older:
>>> http://java.sun.com/docs/codeconv/html/CodeConventions.doc3.html#248
>>> But yes, I first saw this from NetBeans IDE formatting facility.
>>>
>>
>> Ahhh, thank you very much for this history lesson.
>>
>> I have manually adjusted some source files as you requested,
>> but systematically fixing this particular coding style bug
>> is likely to be difficult.
>>
>
> NetBeans IDE does a good job on that. Those other formatting tasks
> may also be handled well there.

One of the standard counter-arguments to pervasive code cleaning changes
is the difficulty of merging that other developers run into.
The standard counter-counter-argument to *that* is
"we provide an automated tool you can run over your own code
to eliminate merge conflicts", but that does actually require
an automated tool, and IDEs are typically not very scriptable.
So I generally greatly prefer changes that can be automated,
(typically a perl script in my usage)
and the automation tool can be checked in to the repo.

Martin


From martinrb at google.com  Mon Mar 29 22:17:22 2010
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 29 Mar 2010 15:17:22 -0700
Subject: review request for 6798511/6860431: Include functionality of 
	Surrogate in Character
In-Reply-To: <4BAE73B7.40101@gmx.de>
References: <4A95079A.8080803@gmx.de> <4BA007A4.2030907@sun.com>
	<4BA3F0B5.1070404@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
	<1ccfd1c11003211239h2105e5f1m903dd5d3fbf5387b@mail.gmail.com>
	<4BA78CE8.9020107@gmx.de>
	<1ccfd1c11003221527q29f61f7u700344a99d293ceb@mail.gmail.com>
	<4BAE73B7.40101@gmx.de>
Message-ID: <1ccfd1c11003291517l4a46f260s5c78639244da6420@mail.gmail.com>

Hi Ulf,

I will sponsor your initiative to refactor the exception handling.

Before this can go in, we should have just the exception handling
changes contained in one patch, since it is such a big change.

I'd like you to try to port my related RangeCheckMicroBenchmark
to string handling and hopefully demonstrate some measurable
performance improvement.

----

In the code below, I think some if's need to be changed to "else if"s.
(but don't just fix it - make sure we have a failing test with your
current code (you do run the regression tests religiously, right?))

+    static void checkPositionIndexes(int srcLen, int begin, int end) {
+        assert (srcLen >= 0);
+        int index;
+        if (begin < 0)
+            index = begin;
+        if (end > srcLen)
+            index = begin>srcLen ? begin:end-begin;
+        if (end < begin)
+            index = begin>srcLen ? begin : end<0 ? end : end-begin;
+        else
+            return;
+        throw new StringIndexOutOfBoundsException(index);
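
To make the concern concrete: with three independent if statements, a call
such as checkPositionIndexes(5, -1, 3) assigns index in the first test but
then falls through to the final else and returns without throwing. The sketch
below shows what the "else if" shape could look like. It is an illustration
only, not the committed code: the wrapper class IndexChecks is hypothetical,
the index expressions are copied unchanged from the patch, and the variable
is renamed badIndex.

```java
// Illustrative wrapper class; only the method body mirrors the patch.
final class IndexChecks {
    static void checkPositionIndexes(int srcLen, int begin, int end) {
        assert srcLen >= 0;
        int badIndex;
        if (begin < 0)
            badIndex = begin;
        else if (end > srcLen)
            badIndex = begin > srcLen ? begin : end - begin;
        else if (end < begin)
            badIndex = begin > srcLen ? begin : end < 0 ? end : end - begin;
        else
            return;                     // 0 <= begin <= end <= srcLen
        throw new StringIndexOutOfBoundsException(badIndex);
    }
}
```

A regression test for this would assert that a negative begin with an
otherwise valid end throws, which the three-if version does not do.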

----
it's => its

+     *              following values are referred in it's message:

----

badIndex might be a better name for "index" below.

+        int index;
+        if (begin < 0)
+            index = begin;

----
Run at least the following tests
(below is how I test this code myself)

/home/martinrb/jct-tools/3.2.2_03/linux/bin/jtreg -v:nopass,fail
-vmoption:-enablesystemassertions -automatic "-k:\!ignore"
-testjdk:/usr/local/google/home/martin/ws/upstream/build/linux-amd64
test/sun/nio/cs test/java/nio/charset test/java/lang/StringCoding
test/java/lang/StringBuilder test/java/lang/StringBuffer
test/java/lang/String test/java/lang/Appendable

----

I think returning len below is too confusing.
Just make the return type void.

+    int checkPositionIndex(int index) {
+        int len = count; // not sure, if JIT recognizes that it's final ?
+        checkPositionIndex(len, index);
+        return len;
+    }

----

We will need a significant merge once I commit
related changes.

----

Thanks,

Martin


On Sat, Mar 27, 2010 at 14:08, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Am 22.03.2010 23:27, schrieb Martin Buchholz:
>>
>> Ulf,
>>
>> I'd like to start a mq patch containing changes to
>> the String exception handling in the string classes.
>> Please provide me with a patch that uses the
>> blessed conventional names from Preconditions.java.
>>
>
> Here are my first patches for start.
> In the 2nd patch I did additional speed-ups, corrections and renamings.
>
> Please review.
>
> -Ulf
>
>
>> For the version that checks an offset and length for
>> containment within a larger sequence, I would prefer
>> the name "checkSubsequence", for example
>>
>> private static void checkSubsequence(int start, int len, int size)
>>
>> Please make sure that there are sufficient tests in
>> test/java/lang/String to ensure that you are not
>> inadvertently making changes to the exceptions thrown.
>>
>> I note that test/java/lang/String/{Exceptions,Supplementary}
>> do try to test exception handling, but do not appear to
>> test for the *exact* class of the exception thrown,
>> nor the detail message of the exception.
>> When those tests were written, compatibility was less important.
>>
>> Please adapt my
>> test/java/util/ArrayList/RangeCheckMicroBenchmark.java
>> to test string classes instead.
>> There is a good chance that you can demonstrate
>> a performance improvement on ordinary String operations!
>>
>> Thanks,
>>
>> Martin
>>
>>
>>
>


From kevin.l.stern at gmail.com  Mon Mar 29 23:24:26 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Mon, 29 Mar 2010 18:24:26 -0500
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <fb5ec5091003290808g4b1c809cr47625316cd1c54cf@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
	<1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
	<fb5ec5091003290808g4b1c809cr47625316cd1c54cf@mail.gmail.com>
Message-ID: <1704b7a21003291624n740dbc8bibbc15e1b8e0291d4@mail.gmail.com>

One advantage of this approach over the VM approach is that no data copy is
necessary when the capacity of the data structure is expanded (new arrays
are tacked on to the end of the top level array of references) or contracted
(arrays of null are removed from the top level array of references) aside
from the (any?) copy of the top level array of references.  One way to
address your concern, though, is to create a ChunkedArray class that simply
wraps an array and provides expand and contract functionality.  This could
be reused in any/all collections.
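
A rough sketch of what such a wrapper might look like follows. Everything in
it is an assumption made for illustration: the name ChunkedArray comes from
the paragraph above, but the method names, the fixed chunk size of 1024, and
the choice to grow the top-level array by one slot at a time are mine.
Expanding or contracting copies only the top-level array of references; the
element chunks themselves are never copied.

```java
import java.util.Arrays;

// Hypothetical reusable wrapper; chunk size and method names are assumptions.
final class ChunkedArray<E> {
    private static final int SHIFT = 10;
    private static final int CHUNK = 1 << SHIFT;    // 1024 elements per chunk
    private static final int MASK = CHUNK - 1;

    private Object[][] chunks = new Object[0][];

    int capacity() {
        return chunks.length << SHIFT;
    }

    // Tack one new chunk onto the end of the top-level array.
    void expand() {
        chunks = Arrays.copyOf(chunks, chunks.length + 1);
        chunks[chunks.length - 1] = new Object[CHUNK];
    }

    // Drop the last chunk; its elements become unreachable.
    void contract() {
        if (chunks.length == 0)
            throw new IllegalStateException("no chunks to remove");
        chunks = Arrays.copyOf(chunks, chunks.length - 1);
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        return (E) chunks[index >>> SHIFT][index & MASK];
    }

    void set(int index, E value) {
        chunks[index >>> SHIFT][index & MASK] = value;
    }
}
```

A real implementation would probably grow the top-level array geometrically
rather than by one slot, so that the reference copy is amortized as well.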

On Mon, Mar 29, 2010 at 10:08 AM, Osvaldo Doederlein <opinali at gmail.com>wrote:

> Initially, it would be good enough to replace only java.util.ArrayList with
> minimal overhead. ArrayList does not support efficient add-at-front or other
> enhancements of ArrayDeque; but ArrayList is still a much more important and
> popular collection, it's the primary "straight replacement for primitive
> arrays" and I guess it should continue with that role.
>
> One problem of both ArrayList and primitive arrays is that they're not
> GC-friendly; huge arrays suck for GC. IBM's realtime Metronome collector
> uses the "arraylet" structure for primitive arrays, so there is a hard
> upper-limit on object size (well, at least as long as apps don't define
> classes with thousands of fields, I guess). This avoids the whole issue of
> "large objects" which permits a simpler heap layout, better incremental GC,
> etc. There are two tradeoffs. First, some overhead for all array operations
> - but this is the least important, especially as the arraylet trick is
> implemented at the VM level so we can rely on the JIT to perform extra
> optimizations (e.g., unrolling and other loop optimizations; bounds-check
> elimination and other array opts may be arraylet-aware, so most overhead is
> cancelled or at least lifted out of loop bodies and hot paths.) Second, no
> support at all for huge arrays is incompatible with native code that expects
> a contiguous layout, e.g. for the byte[]s inside Images - so all these uses
> must be identified and fixed somehow, e.g. using DirectBuffers, or changing
> the native layer to understand arraylets (image libraries may be OK with
> banding), or in the worst case just copy the data to/from a contiguous,
> native array (in most cases I think this copy already happens for other
> reasons, so there's no extra copy, just a slightly more expensive copy).
>
> Now we're talking about some big VM change of course, but HotSpot would not
> be the first production VM to do this so maybe it's a viable project for the
> future, especially as Sun plans to keep raising the bar in
> incremental/realtime GC (G1 may already be a great step forward, but huge
> arrays will always spoil the fun for many apps).
>
> In summary I think the ChunkedArrayList would serve only as a stopgap
> solution, with extremely limited benefits unless it's good enough that,
> as Martin says, we can replace more List implementations. And I'll even
> add, replace many other collections too - e.g. a giant HashMap will contain
> a giant Entry[] array inside it, I want this array to be chunked too
> (ConcurrentHashMap already is, but it's tuned up differently, for concurrent
> usage - and that's just one example anyway). And by "replace" I further mean
> "change the implementation of all existing collections that are
> array-backed", not "offer new collections" as the latter will only be
> heavily used ten years from today when JavaSE7 is considered the minimum
> JavaSE release to be supported by apps/libraries/frameworks/containers/etc.
> Even then, the benefits will be clearly inferior to what can be achieved by
> VM-level arraylets.
>
> A+
> Osvaldo
>
>
> 2010/3/29 Martin Buchholz <martinrb at google.com>
>
> On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com>
>> wrote:
>> > I put together the following class, ChunkedArrayList, in response to
>> > Martin's request (excerpted from an earlier conversation on this web
>> board)
>> > below.
>> >
>> >
>> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
>> >
>> > Thoughts?
>>
>> This class is well on the way to what I was thinking of,
>> but my bar for acceptance is a little higher.
>> In particular, I don't want to add yet another class
>> that can replace some, but not all, of the existing
>> list implementations.
>>
>> Most obviously, I don't want to lose the ability,
>> introduced in ArrayDeque, of having O(1) insertion
>> at the front and end of the collection.
>> Perhaps you can do this by having one "arraylet"
>> always be shared by both ends, which
>> grow towards each other in circular fashion.
>>
>> I also think we should shrink the array when
>> necessary, so that occupancy never drops
>> below, say 50%.
>>
>> Perhaps we should also have amortized O(1)
>> insertion in the middle by using a "gap array".
>> Probably more important for byte/char collections
>> like StringBuilder...
>>
>> I believe there are more complicated implementations
>> that permit O(1) insertions at the ends, and only
>> O(sqrt(N)) space overhead.
>>
>> ....
>>
>> E.g. Use your favorite search engine to do
>> some research on:
>> Resizable arrays in optimal time and space
>> Succinct dynamic data structures
>>
>> Meta-comment: there is not enough transfer of
>> academic research results into practice; I would think this
>> is one of the responsibilities of the researchers.
>>
>> I presume you'd be willing to sign a
>> contributor agreement to get your changes into
>> the JDK someday.
>>
>> Martin
>>
>> > Regards,
>> >
>> > Kevin
>> >
>> >
>> > On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
>> > wrote:
>> >
>> >     It surely is not a good idea to use a single backing array
>> >     for huge arrays.  As you point out, it's up to 32GB
>> >     for just one object.  But the core JDK
>> >     doesn't offer a suitable alternative for users who need very
>> >     large collections.
>> >
>> >     It would have been more in the spirit of Java to have a
>> >     collection class instead of ArrayList that was not fastest at
>> >     any particular operation, but had excellent asymptotic behaviour,
>> >     based on backing arrays containing backing arrays.
>> >     But:
>> >     - no such excellent class has been written yet
>> >      (or please point me to such a class)
>> >     - even if it were, such a best-of-breed-general-purpose
>> >      List implementation would probably need to be introduced as a
>> >      separate class, because of the performance expectations of
>> >      existing implementations.
>> >
>> >     In the meantime, we have to maintain what we got,
>> >     and that includes living with arrays and classes that wrap them.
>> >
>> >     Changing the spec is unlikely to succeed..
>> >
>> >     Martin
>> >
>>
>
>

From Weijun.Wang at Sun.COM  Tue Mar 30 08:08:27 2010
From: Weijun.Wang at Sun.COM (Weijun Wang)
Date: Tue, 30 Mar 2010 16:08:27 +0800
Subject: java.util.Pair
Message-ID: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>

Hi All

There are multiple CRs asking for a java.util.Pair class:

   4983155
   6229146
   4947273

I know such a simple thing can be made very complex and everyone might want to add a new method into it. How about we just make it most primitive? Simply an immutable and Serializable class, two final fields, one constructor, two getters (?), and no static factory methods. (S)he who does the real implementation has the privilege to choose between head/tail and car/cdr.
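
As a concrete starting point, the minimal shape described above might look
roughly like the sketch below. The accessor names getFirst/getSecond are
placeholders only (the naming is explicitly left open), and the equals,
hashCode and toString methods are an addition beyond the stated minimum,
included because a value class is hard to use without them. The class is
only truly serializable when both components are.

```java
import java.io.Serializable;
import java.util.Objects;

// Sketch only: field and accessor names are placeholders, not a proposal.
final class Pair<A, B> implements Serializable {
    private static final long serialVersionUID = 1L;

    private final A first;
    private final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    A getFirst() {
        return first;
    }

    B getSecond() {
        return second;
    }

    // Not part of the stated minimum; added so pairs compare by value.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Pair)) return false;
        Pair<?, ?> other = (Pair<?, ?>) o;
        return Objects.equals(first, other.first)
            && Objects.equals(second, other.second);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }

    @Override
    public String toString() {
        return "(" + first + ", " + second + ")";
    }
}
```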

Thanks
Max


From brucechapman at paradise.net.nz  Tue Mar 30 09:03:44 2010
From: brucechapman at paradise.net.nz (Bruce Chapman)
Date: Tue, 30 Mar 2010 22:03:44 +1300
Subject: java.util.Pair
In-Reply-To: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
Message-ID: <4BB1BE70.2060901@paradise.net.nz>

Weijun Wang wrote:
> Hi All
>
> There are multiple CRs asking for a java.util.Pair class:
>
>    4983155
>    6229146
>    4947273
>
> I know such a simple thing can be made very complex and everyone might want to add a new method into it. How about we just make it most primitive? Simply an immutable and Serializable class, two final fields, one constructor, two getters (?), and no static factory methods. (S)he who does the real implementation has the privilege to choose between head/tail and car/cdr.
>
>   
or first/second or left/right or a/b or foo/bar or chalk/cheese

Bruce


> Thanks
> Max
>
>
>   


From kevin.l.stern at gmail.com  Tue Mar 30 11:25:41 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Tue, 30 Mar 2010 06:25:41 -0500
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
	<1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
Message-ID: <1704b7a21003300425i7dd1ef7he28728ad3cdb60e2@mail.gmail.com>

Hi Martin,

Thanks much for your feedback.  The first approach that comes to mind to
implement O(1) time front as well as rear insertion is to create a cyclic
list structure with a front/rear pointer - to insert at the front requires
decrementing the front pointer (modulo the size) and to insert at the rear
requires incrementing the rear pointer (modulo the size).  We need to resize
when the two pointers bump into each other.  Could you explain more about
your suggestion of introducing an arraylet that is shared by the front and
the rear?  It's not clear to me how that would help and/or be a better
approach than the cyclic list.  Anyhow, the paper that you reference,
"Resizable arrays in optimal time and space", gives a deque so if we take
that approach then the deque is specified.
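
To make the cyclic structure concrete, here is a minimal single-array sketch.
It is an illustration under stated assumptions, not the ChunkedArrayList code:
the class name CircularDeque is hypothetical, the capacity is kept at a power
of two so that "modulo the size" becomes a bit mask, and the rear pointer is
derived as front + size rather than stored, which is equivalent to detecting
when the two pointers meet.

```java
// Hypothetical single-array circular deque; capacity is always a power of two.
final class CircularDeque<E> {
    private Object[] elems = new Object[8];
    private int front = 0;      // physical index of the first element
    private int size = 0;

    int size() {
        return size;
    }

    void addFirst(E e) {
        if (size == elems.length)
            resize();
        front = (front - 1) & (elems.length - 1);   // decrement modulo capacity
        elems[front] = e;
        size++;
    }

    void addLast(E e) {
        if (size == elems.length)
            resize();
        elems[(front + size) & (elems.length - 1)] = e;
        size++;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= size)
            throw new IndexOutOfBoundsException("index: " + index);
        return (E) elems[(front + index) & (elems.length - 1)];
    }

    // Double the capacity and unwrap the elements so that front is 0 again.
    private void resize() {
        Object[] bigger = new Object[elems.length * 2];
        for (int i = 0; i < size; i++)
            bigger[i] = elems[(front + i) & (elems.length - 1)];
        elems = bigger;
        front = 0;
    }
}
```

In a chunked variant the same index arithmetic would select an arraylet and
an offset instead of a slot in one array; that part is not shown here.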

Shrinking the array is not a problem - this comes 'for free' (in the sense
that it's required) in the optimal space data structure that you reference.

Regarding the gap array suggestion, it is not clear to me how we will still
compute the correct arraylet/offset for an index in O(1) time if we have
arraylets of arbitrary size.  Even worse, if we go with the optimal space
data structure we will not have the option of creating arraylets of
arbitrary size or with arbitrary gaps between elements.

You are absolutely right about the n^(1/2) space overhead; I was not aware
of this research.  I'll go ahead and implement the structure defined in
"Resizable arrays in optimal time and space" (once I find some time to do
so).

Regards,

Kevin

On Mon, Mar 29, 2010 at 2:23 AM, Martin Buchholz <martinrb at google.com>wrote:

> On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
> > I put together the following class, ChunkedArrayList, in response to
> > Martin's request (excerpted from an earlier conversation on this web
> board)
> > below.
> >
> >
> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
> >
> > Thoughts?
>
> This class is well on the way to what I was thinking of,
> but my bar for acceptance is a little higher.
> In particular, I don't want to add yet another class
> that can replace some, but not all, of the existing
> list implementations.
>
> Most obviously, I don't want to lose the ability,
> introduced in ArrayDeque, of having O(1) insertion
> at the front and end of the collection.
> Perhaps you can do this by having one "arraylet"
> always be shared by both ends, which
> grow towards each other in circular fashion.
>
> I also think we should shrink the array when
> necessary, so that occupancy never drops
> below, say 50%.
>
> Perhaps we should also have amortized O(1)
> insertion in the middle by using a "gap array".
> Probably more important for byte/char collections
> like StringBuilder...
>
> I believe there are more complicated implementations
> that permit O(1) insertions at the ends, and only
> O(sqrt(N)) space overhead.
>
> ....
>
> E.g. Use your favorite search engine to do
> some research on:
> Resizable arrays in optimal time and space
> Succinct dynamic data structures
>
> Meta-comment: there is not enough transfer of
> academic research results into practice; I would think this
> is one of the responsibilities of the researchers.
>
> I presume you'd be willing to sign a
> contributor agreement to get your changes into
> the JDK someday.
>
> Martin
>
> > Regards,
> >
> > Kevin
> >
> >
> > On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
> > wrote:
> >
> >     It surely is not a good idea to use a single backing array
> >     for huge arrays.  As you point out, it's up to 32GB
> >     for just one object.  But the core JDK
> >     doesn't offer a suitable alternative for users who need very
> >     large collections.
> >
> >     It would have been more in the spirit of Java to have a
> >     collection class instead of ArrayList that was not fastest at
> >     any particular operation, but had excellent asymptotic behaviour,
> >     based on backing arrays containing backing arrays.
> >     But:
> >     - no such excellent class has been written yet
> >      (or please point me to such a class)
> >     - even if it were, such a best-of-breed-general-purpose
> >      List implementation would probably need to be introduced as a
> >      separate class, because of the performance expectations of
> >      existing implementations.
> >
> >     In the meantime, we have to maintain what we got,
> >     and that includes living with arrays and classes that wrap them.
> >
> >     Changing the spec is unlikely to succeed..
> >
> >     Martin
> >
>

From kevin.l.stern at gmail.com  Tue Mar 30 11:46:17 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Tue, 30 Mar 2010 06:46:17 -0500
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1704b7a21003300425i7dd1ef7he28728ad3cdb60e2@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
	<1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
	<1704b7a21003300425i7dd1ef7he28728ad3cdb60e2@mail.gmail.com>
Message-ID: <1704b7a21003300446y24f5524dwa61322af324bffd6@mail.gmail.com>

Just to state the obvious, though, operations will be somewhat slower with
the optimal space structure.  Retrieval, for instance, requires more than
simply a shift and a bit mask (although not too much more).
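
For reference, the shift-and-mask retrieval mentioned above looks roughly
like this when all chunks have the same power-of-two size. This is a
minimal sketch, not the ChunkedArrayList code from the link; the class
name, CHUNK_BITS and the fixed chunk size are assumptions made purely
for illustration.

```java
// Hypothetical sketch: index lookup in a store made of fixed-size chunks.
// The chunk size is a power of two, so the chunk number is a shift and
// the offset within the chunk is a bit mask.
final class FixedChunkStore {
    static final int CHUNK_BITS = 10;                // assumed: 1024 elements per chunk
    static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final Object[][] chunks;

    FixedChunkStore(int capacity) {
        int n = (capacity + CHUNK_SIZE - 1) >>> CHUNK_BITS;  // round up
        chunks = new Object[n][CHUNK_SIZE];
    }

    Object get(int index) {
        return chunks[index >>> CHUNK_BITS][index & CHUNK_MASK];
    }

    void set(int index, Object value) {
        chunks[index >>> CHUNK_BITS][index & CHUNK_MASK] = value;
    }
}
```

With blocks of varying size the block number can no longer be read off
with a single shift, which is where the extra work comes from.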

On Tue, Mar 30, 2010 at 6:25 AM, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:

> Hi Martin,
>
> Thanks much for your feedback.  The first approach that comes to mind to
> implement O(1) time front as well as rear insertion is to create a cyclic
> list structure with a front/rear pointer - to insert at the front requires
> decrementing the front pointer (modulo the size) and to insert at the rear
> requires incrementing the rear pointer (modulo the size).  We need to resize
> when the two pointers bump into each other.  Could you explain more about
> your suggestion of introducing an arraylet that is shared by the front and
> the rear?  It's not clear to me how that would help and/or be a better
> approach than the cyclic list.  Anyhow, the paper that you reference,
> "Resizable arrays in optimal time and space", gives a deque so if we take
> that approach then the deque is specified.
>
> Shrinking the array is not a problem - this comes 'for free' (in the sense
> that it's required) in the optimal space data structure that you reference.
>
> Regarding the gap array suggestion, it is not clear to me how we will still
> compute the correct arraylet/offset for an index in O(1) time if we have
> arraylets of arbitrary size.  Even worse, if we go with the optimal space
> data structure we will not have the option of creating arraylets of
> arbitrary size or with arbitrary gaps between elements.
>
> You are absolutely right about the n^(1/2) space overhead; I was not aware
> of this research.  I'll go ahead and implement the structure defined in
> "Resizable arrays in optimal time and space" (once I find some time to do
> so).
>
> Regards,
>
> Kevin
>
>
> On Mon, Mar 29, 2010 at 2:23 AM, Martin Buchholz <martinrb at google.com> wrote:
>
>> On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com>
>> wrote:
>> > I put together the following class, ChunkedArrayList, in response to
>> > Martin's request (excerpted from an earlier conversation on this web
>> board)
>> > below.
>> >
>> >
>> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
>> >
>> > Thoughts?
>>
>> This class is well on the way to what I was thinking of,
>> but my bar for acceptance is a little higher.
>> In particular, I don't want to add yet another class
>> that can replace some, but not all, of the existing
>> list implementations.
>>
>> Most obviously, I don't want to lose the ability,
>> introduced in ArrayDeque, of having O(1) insertion
>> at the front and end of the collection.
>> Perhaps you can do this by having one "arraylet"
>> always be shared by both ends, which
>> grow towards each other in circular fashion.
>>
>> I also think we should shrink the array when
>> necessary, so that occupancy never drops
>> below, say 50%.
>>
>> Perhaps we should also have amortized O(1)
>> insertion in the middle by using a "gap array".
>> Probably more important for byte/char collections
>> like StringBuilder...
>>
>> I believe there are more complicated implementations
>> that permit O(1) insertions at the ends, and only
>> O(sqrt(N)) space overhead.
>>
>> ....
>>
>> E.g. Use your favorite search engine to do
>> some research on:
>> Resizable arrays in optimal time and space
>> Succinct dynamic data structures
>>
>> Meta-comment: there is not enough transfer of
>> academic research results into practice; I would think this
>> is one of the responsibilities of the researchers.
>>
>> I presume you'd be willing to sign a
>> contributor agreement to get your changes into
>> the JDK someday.
>>
>> Martin
>>
>> > Regards,
>> >
>> > Kevin
>> >
>> >
>> > On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
>> > wrote:
>> >
>> >     It surely is not a good idea to use a single backing array
>> >     for huge arrays.  As you point out, it's up to 32GB
>> >     for just one object.  But the core JDK
>> >     doesn't offer a suitable alternative for users who need very
>> >     large collections.
>> >
>> >     It would have been more in the spirit of Java to have a
>> >     collection class instead of ArrayList that was not fastest at
>> >     any particular operation, but had excellent asymptotic behaviour,
>> >     based on backing arrays containing backing arrays.
>> >     But:
>> >     - no such excellent class has been written yet
>> >      (or please point me to such a class)
>> >     - even if it were, such a best-of-breed-general-purpose
>> >      List implementation would probably need to be introduced as a
>> >      separate class, because of the performance expectations of
>> >      existing implementations.
>> >
>> >     In the meantime, we have to maintain what we got,
>> >     and that includes living with arrays and classes that wrap them.
>> >
>> >     Changing the spec is unlikely to succeed..
>> >
>> >     Martin
>> >
>>
>
>

From Ulf.Zibis at gmx.de  Tue Mar 30 11:46:52 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 30 Mar 2010 13:46:52 +0200
Subject: Refactor String's exception handling
In-Reply-To: <1ccfd1c11003291517l4a46f260s5c78639244da6420@mail.gmail.com>
References: <4A95079A.8080803@gmx.de> <4BA007A4.2030907@sun.com>	
	<4BA3F0B5.1070404@gmx.de>	
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>	
	<4BA543A0.2060600@gmx.de>	
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>	
	<1ccfd1c11003211239h2105e5f1m903dd5d3fbf5387b@mail.gmail.com>	
	<4BA78CE8.9020107@gmx.de>	
	<1ccfd1c11003221527q29f61f7u700344a99d293ceb@mail.gmail.com>	
	<4BAE73B7.40101@gmx.de>
	<1ccfd1c11003291517l4a46f260s5c78639244da6420@mail.gmail.com>
Message-ID: <4BB1E4AC.9000307@gmx.de>

Am 30.03.2010 00:17, schrieb Martin Buchholz:
> Hi Ulf,
>
> I will sponsor your initiative to refactor the exception handling.
>
> Before this can go in, we should have just the exception handling
> changes contained in one patch, since it is such a big change.
>    

You mean that I "surreptitiously" included some beautification,
even in the first patch?
Yes, I often can't resist; hit me. An example:
It seems that someone had already tried to standardize the this-triple
in String's constructors. Looking closer, you can see that they
differ slightly, so to my taste it looked best to order them in the
member variables' order, with the real value first.

On the other hand, I think it's too much overhead to manage separate
bugs for such beautifications.
What do you think is a reasonable threshold for such on-the-fly
beautifications?


> I'd like you to try to port my related RangeCheckMicroBenchmark
> to string handling and hopefully demonstrate some measurable
> performance improvement.
>    

That would be great. :-)

> ----
>
> In the code below, I think some if's need to be changed to "else if"s.
> (but don't just fix it - make sure we have a failing test with your
> current code (you do run the regression tests religiously, right?))
>
> +    static void checkPositionIndexes(int srcLen, int begin, int end) {
> +        assert (srcLen >= 0);
> +        int index;
> +        if (begin < 0)
> +            index = begin;
> +        if (end > srcLen)
> +            index = begin > srcLen ? begin : end - begin;
> +        if (end < begin)
> +            index = begin > srcLen ? begin : end < 0 ? end : end - begin;
> +        else
> +            return;
> +        throw new StringIndexOutOfBoundsException(index);
>    

Good catch. The throws that I replaced had implied the elses before.
In Google code style it would have been:  if (begin < 0 || end > srcLen
|| end < begin)

You seem to like how I merged the different variations into one central
standard behaviour. Is that valid for AbstractStringBuilder too?
I think it best matches the current behaviour.
The exception message refers to ...
1. <begin>, if begin itself is invalid with respect to 0 and srcLen
2. <end>, if end itself is invalid with respect to 0 and srcLen
3. <end-begin>, if end is invalid in combination with the given begin
Alternative:
2+3. <end>, if end is invalid with respect to 0 and srcLen or in
combination with the given begin

The alternative may be easier for developers to track, but it is less
compatible with the current behaviour, and a likely negative value kind
of speaks for itself.
In the checkSubsequence(..., offset, count) case, unfortunately, there is
a good chance of getting positive values as the result of offset+count.

> ----
> it's =>  its
>
> +     *              following values are referred in it's message:
>    

Yes.

> ----
>
> badIndex might be a better name for "index" below.
>
> +        int index;
> +        if (begin < 0)
> +            index = begin;
>    

Very good idea!

> ----
> Run at least the following tests
> (below is how I test this code myself)
>
> /home/martinrb/jct-tools/3.2.2_03/linux/bin/jtreg -v:nopass,fail
> -vmoption:-enablesystemassertions -automatic "-k:\!ignore"
> -testjdk:/usr/local/google/home/martin/ws/upstream/build/linux-amd64
> test/sun/nio/cs test/java/nio/charset test/java/lang/StringCoding
> test/java/lang/StringBuilder test/java/lang/StringBuffer
> test/java/lang/String test/java/lang/Appendable
>    

Unfortunately I still haven't managed to build a patched JDK, even
partly, on my Windows notebook.
- Cygwin crashes when the job is too big, e.g. webrev on more than ~20 files.
- There is very little support on the <nb-projects-dev at openjdk.java.net>
mailing list.
- I wonder why there is so little collaboration between NetBeans and
JDK developers in the same software company.

So as a workaround, I'm fine with running my patches via -Xbootclasspath
in the NetBeans IDE.
But I don't know how to run the jtreg tests.
I wrote my test exclusively with JUnit, because NetBeans supports it
beautifully.
I remember an email from Mark Reinhold some months ago saying that
JUnit tests are now supported by jtreg too.

Maybe you have some suggestions for me.

> ----
>
> I think returning len below is too confusing.
> Just make the return type void.
>
> +    int checkPositionIndex(int index) {
> +        int len = count; // not sure, if JIT recognizes that it's final ?
> +        checkPositionIndex(len, index);
> +        return len;
> +    }
>    

Returning len avoids slowly loading the member variable into a local
register/variable twice.
In terms of performance I think we only have two choices: using the
return trick, or dropping those convenience methods altogether.
The latter would be faster for the interpreter and/or the non-inlined case.
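
To make the trade-off concrete, here is a small sketch of the two
variants. The class and the method names ending in Void/Len are
hypothetical; only the count field and the idea of returning the length
come from the snippet under discussion. Whether the JIT would remove
the second field load anyway is not something this sketch settles.

```java
// Hypothetical sketch contrasting the two styles discussed above.
final class Chars {
    private final char[] value;
    private int count;                  // number of chars in use

    Chars(String s) {
        value = s.toCharArray();
        count = value.length;
    }

    // Variant A: void check; the caller reads 'count' again afterwards.
    void checkPositionIndexVoid(int index) {
        if (index < 0 || index > count)
            throw new StringIndexOutOfBoundsException(index);
    }

    // Variant B: the "return trick"; 'count' is read once into a local
    // and handed back so that the caller can reuse it.
    int checkPositionIndexLen(int index) {
        int len = count;
        if (index < 0 || index > len)
            throw new StringIndexOutOfBoundsException(index);
        return len;
    }

    String tailA(int from) {
        checkPositionIndexVoid(from);
        return new String(value, from, count - from);  // second read of the field
    }

    String tailB(int from) {
        int len = checkPositionIndexLen(from);
        return new String(value, from, len - from);    // reuses the local
    }
}
```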

> ----
>
> We will need a significant merge once I commit
> related changes.
>    

Maybe we could announce this on this list, so that others can decide
whether to hurry and commit their changes beforehand, or to do their own
merge later.

-Ulf


From Ulf.Zibis at gmx.de  Tue Mar 30 11:58:52 2010
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 30 Mar 2010 13:58:52 +0200
Subject: Pending Character-related work
In-Reply-To: <4BB1C477.1040604@sun.com>
References: <1ccfd1c11003291547k1bde83d7n9e8ed3cf367aa4a0@mail.gmail.com>
	<4BB1C477.1040604@sun.com>
Message-ID: <4BB1E77C.4050600@gmx.de>

I'd like to add that all this bit-twiddling would be much easier if we
got unsigned integers in Java.
The compare against 0, repeated everywhere and sometimes optimized
differently, would become superfluous.
This hope seems gone for JDK 7.
If they come one day, we can start the twiddling again.
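
To illustrate the repeated compare against 0: with signed ints a range
check needs two comparisons, while an unsigned comparison needs only
one, because a negative index reinterpreted as unsigned is larger than
any valid length. The sketch below uses Integer.compareUnsigned, which
is not available in JDK 7 (it only arrived later, in Java 8); the class
and method names are hypothetical.

```java
// Hypothetical sketch: signed versus unsigned range check.
final class UnsignedCheck {
    // Signed version: two comparisons, including the compare against 0.
    static boolean inRangeSigned(int index, int length) {
        return index >= 0 && index < length;
    }

    // Unsigned version: one comparison; assumes length >= 0.
    static boolean inRangeUnsigned(int index, int length) {
        return Integer.compareUnsigned(index, length) < 0;
    }
}
```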

-Ulf


Am 30.03.2010 11:29, schrieb Masayoshi Okutsu:
> Hi Martin,
>
> I'm starting code review. I'm not an Oracle person yet, though. :-)
>
> Thanks,
> --
> Masayoshi
> Sun Microsystems K.K.
>
> On 3/30/2010 7:47 AM, Martin Buchholz wrote:
>> Hi Character team,
>>
>> Below is the (very long) list of pending changes in my queue to be
>> committed.
>>
>> ...
>>


From assembling.signals at yandex.ru  Tue Mar 30 12:39:12 2010
From: assembling.signals at yandex.ru (assembling signals)
Date: Tue, 30 Mar 2010 16:39:12 +0400
Subject: java.util.Pair
In-Reply-To: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
Message-ID: <226001269952752@webmail122.yandex.ru>

Hi!

Do you mean it would be good to have a "standard" class implementing the interface Map.Entry<K,V>?
This does exist: AbstractMap.SimpleEntry<K,V>. Well, of course both the interface and the class are
somewhat 'hidden', but nevertheless, they do exist.
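
A short sketch of using these existing classes as an ad-hoc pair. The
wrapper class EntryDemo and its method names are hypothetical;
AbstractMap.SimpleEntry and AbstractMap.SimpleImmutableEntry are the
real java.util classes (available since Java 6).

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical demo class; only the java.util types are real API.
final class EntryDemo {
    // Mutable pair: setValue replaces the value and returns the old one.
    static Map.Entry<String, Integer> mutablePair() {
        return new AbstractMap.SimpleEntry<String, Integer>("answer", 42);
    }

    // Immutable pair: setValue throws UnsupportedOperationException.
    static Map.Entry<String, Integer> immutablePair() {
        return new AbstractMap.SimpleImmutableEntry<String, Integer>("answer", 42);
    }
}
```

Both are Serializable and compare equal to any other Map.Entry with the
same key and value.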


30.03.10, 16:08, "Weijun Wang" <Weijun.Wang at Sun.COM>:

> Hi All
>  
>  There are multiple CRs asking for a java.util.Pair class:
>  
>     4983155
>     6229146
>     4947273
>  
>  I know such a simple thing can be made very complex and everyone might want to add a new method into it. How about we just make it most primitive? Simply an immutable and Serializable class, two final fields, one constructor, two getters (?), and no static factory methods. (S)he who does the real implementation has the privilege to choose between head/tail and car/cdr.
>  
>  Thanks
>  Max
>  
>  
>  



From opinali at gmail.com  Tue Mar 30 13:30:57 2010
From: opinali at gmail.com (Osvaldo Doederlein)
Date: Tue, 30 Mar 2010 10:30:57 -0300
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1704b7a21003291624n740dbc8bibbc15e1b8e0291d4@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
	<1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
	<fb5ec5091003290808g4b1c809cr47625316cd1c54cf@mail.gmail.com>
	<1704b7a21003291624n740dbc8bibbc15e1b8e0291d4@mail.gmail.com>
Message-ID: <fb5ec5091003300630q39e8a5e6h92d6d349d29208f@mail.gmail.com>

The VM-based arraylet implementation is by design minimalistic: it only
splits large arrays into smaller ones, nothing more. You must still wrap
primitive arrays with collection APIs if you want anything else, including
dynamic size. But the opportunity to get some extra VM help for dynamic
sizing is obvious. Consider C's realloc() function. The Java language
doesn't currently have a realloc()-like API, because it's generally useless
in a garbage-collected heap that most often does not use free lists, and
most often allows compaction (realloc() is largely a clever,
statistically-efficient trick to gain some performance back from fragmented
heaps).

Now, arraylets would enable a special kind of safe realloc() operation that
makes sense for primitive arrays: it would always return a new array, in the
sense that the "root" array (pointers to slices) is new; but sharing most of
the slices with the original array. So if you have a 100K-element array
(that needs a 100-element root array for 1K slices), and grow it to
150K positions, we only need to allocate new slices for the extra 50K
positions, plus the new 150-element root array. And we only need to copy
data from the 100 positions of the old root array to the new one (and maybe,
from a single slice in the end of the original array, if its size didn't
match the maximum slice size - but then the collections growing algorithm
could easily avoid this). Array shrinking is even easier.
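
The growth step described above can be sketched at the library level as
follows. This only illustrates the slice-sharing idea; it is not how a
VM would implement arraylets. The class name, the 1K slice size and the
restriction to whole slices are assumptions, and bounds checks against
the logical length are omitted for brevity.

```java
import java.util.Arrays;

// Hypothetical library-level sketch of the "safe realloc" idea.
final class SlicedLongArray {
    static final int SLICE = 1024;      // assumed slice size (1K elements)
    final long[][] root;                // root array: pointers to slices
    final int length;                   // logical number of elements

    SlicedLongArray(int length) {
        this.length = length;
        root = new long[(length + SLICE - 1) / SLICE][];
        for (int i = 0; i < root.length; i++)
            root[i] = new long[SLICE];  // always whole slices
    }

    private SlicedLongArray(long[][] root, int length) {
        this.root = root;
        this.length = length;
    }

    long get(int i) { return root[i / SLICE][i % SLICE]; }

    void set(int i, long v) { root[i / SLICE][i % SLICE] = v; }

    // Returns a new array that shares every existing slice with this one.
    // Only the root array is copied; only the extra slices are allocated.
    SlicedLongArray grow(int newLength) {
        if (newLength < length)
            throw new IllegalArgumentException("newLength < length");
        long[][] newRoot = Arrays.copyOf(root, (newLength + SLICE - 1) / SLICE);
        for (int i = root.length; i < newRoot.length; i++)
            newRoot[i] = new long[SLICE];
        return new SlicedLongArray(newRoot, newLength);
    }
}
```

Growing 100K elements to 150K copies 100 slice references and allocates
50 new slices; a write through either array is visible through the
other, which is the aliasing discussed in the next paragraph.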

Notice that the old root array is still a live object, and now its slices
are aliased by a new root array, but this is only potentially confusing,
it's not unsafe. Collections would encapsulate these arrays and not expose
any aliasing or sharing (by not keeping any reference to the original array
after resize operations).

A+
Osvaldo

2010/3/29 Kevin L. Stern <kevin.l.stern at gmail.com>

> One advantage of this approach over the VM approach is that no data copy is
> necessary when the capacity of the data structure is expanded (new arrays
> are tacked on to the end of the top level array of references) or contracted
> (arrays of null are removed from the top level array of references) aside
> from the (any?) copy of the top level array of references.  One way to
> address your concern, though, is to create a ChunkedArray class that simply
> wraps an array and provides expand and contract functionality.  This could
> be reused in any/all collections.
>
>
> On Mon, Mar 29, 2010 at 10:08 AM, Osvaldo Doederlein <opinali at gmail.com> wrote:
>
>> Initially, it would be good enough to replace only java.util.ArrayList
>> with minimal overhead. ArrayList does not support efficient add-at-front or
>> other enhancements of ArrayDeque; but ArrayList is still a much more
>> important and popular collection, it's the primary "straight replacement for
>> primitive arrays" and I guess it should continue with that role.
>>
>> One problem of both ArrayList and primitive arrays is that they're not
>> GC-friendly; huge arrays suck for GC. IBM's realtime Metronome collector
>> uses the "arraylet" structure for primitive arrays, so there is a hard
>> upper-limit on object size (well, at least as long as apps don't define
>> classes with thousands of fields, I guess). This avoids the whole issue of
>> "large objects" which permits a simpler heap layout, better incremental GC,
>> etc. There are two tradeoffs. First, some overhead for all array operations
>> - but this is the least important, remarkably as the arraylet trick is
>> implemented at the VM level so we can rely on the JIT to perform extra
>> optimizations (e.g., unrolling and other loop optimizations; bounds-check
>> elimination and other array opts, may be arraylet-aware so most overhead is
>> cancelled or at least lifted out of loop bodies and hot paths.) Second, no
>> support at all for huge arrays is incompatible with native code that expects
>> a continuous layout, e.g. for the byte[]s inside Images - so all these uses
>> must be identified and fixed somehow, e.g using DirectBuffers, or changing
>> the native layer to understand arraylets (image libraries may be OK with
>> banding), or in the worst case just copy the data to/from a continuous,
>> native array (in most cases I think this copy already happens for other
>> reasons, so there's no extra copy, just a slightly more expensive copy).
>>
>> Now we're talking about some big VM change of course, but HotSpot would
>> not be the first production VM to do this so maybe it's a viable project for
>> the future, remarkably as Sun plans to keep raising the bar in
>> incremental/realtime GC (G1 may already be a great step forward, but huge
>> arrays will always spoil the fun for many apps).
>>
>> In summary I think the ChunkedArrayList would serve only as a stopgap
>> solution, with extremely limited benefits unless it's sufficiently good so
>> like Martin says, we can replace more List implementations. And I'll even
>> add, replace many other collections too - e.g. a giant HashMap will contain
>> a giant Entry[] array inside it, I want this array to be chunked too
>> (ConcurrentHashMap already is, but it's tuned up differently, for concurrent
>> usage - and that's just one example anyway). And by "replace" I further mean
>> "change the implementation of all existing collections that are
>> array-backed", not "offer new collections" as the latter will only be
>> heavily used ten years from today when JavaSE7 is considered the minimum
>> JavaSE release to be supported by apps/libraries/frameworks/containers/etc.
>> Even then, the benefits will be clearly inferior to what can be achieved by
>> VM-level arraylets.
>>
>> A+
>> Osvaldo
>>
>>
>> 2010/3/29 Martin Buchholz <martinrb at google.com>
>>
>>  On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com>
>>> wrote:
>>> > I put together the following class, ChunkedArrayList, in response to
>>> > Martin's request (excerpted from an earlier conversation on this web
>>> board)
>>> > below.
>>> >
>>> >
>>> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
>>> >
>>> > Thoughts?
>>>
>>> This class is well on the way to what I was thinking of,
>>> but my bar for acceptance is a little higher.
>>> In particular, I don't want to add yet another class
>>> that can replace some, but not all, of the existing
>>> list implementations.
>>>
>>> Most obviously, I don't want to lose the ability,
>>> introduced in ArrayDeque, of having O(1) insertion
>>> at the front and end of the collection.
>>> Perhaps you can do this by having one "arraylet"
>>> always be shared by both ends, which
>>> grow towards each other in circular fashion.
>>>
>>> I also think we should shrink the array when
>>> necessary, so that occupancy never drops
>>> below, say 50%.
>>>
>>> Perhaps we should also have amortized O(1)
>>> insertion in the middle by using a "gap array".
>>> Probably more important for byte/char collections
>>> like StringBuilder...
>>>
>>> I believe there are more complicated implementations
>>> that permit O(1) insertions at the ends, and only
>>> O(sqrt(N)) space overhead.
>>>
>>> ....
>>>
>>> E.g. Use your favorite search engine to do
>>> some research on:
>>> Resizable arrays in optimal time and space
>>> Succinct dynamic data structures
>>>
>>> Meta-comment: there is not enough transfer of
>>> academic research results into practice; I would think this
>>> is one of the responsibilities of the researchers.
>>>
>>> I presume you'd be willing to sign a
>>> contributor agreement to get your changes into
>>> the JDK someday.
>>>
>>> Martin
>>>
>>> > Regards,
>>> >
>>> > Kevin
>>> >
>>> >
>>> > On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
>>> > wrote:
>>> >
>>> >     It surely is not a good idea to use a single backing array
>>> >     for huge arrays.  As you point out, it's up to 32GB
>>> >     for just one object.  But the core JDK
>>> >     doesn't offer a suitable alternative for users who need very
>>> >     large collections.
>>> >
>>> >     It would have been more in the spirit of Java to have a
>>> >     collection class instead of ArrayList that was not fastest at
>>> >     any particular operation, but had excellent asymptotic behaviour,
>>> >     based on backing arrays containing backing arrays.
>>> >     But:
>>> >     - no such excellent class has been written yet
>>> >      (or please point me to such a class)
>>> >     - even if it were, such a best-of-breed-general-purpose
>>> >      List implementation would probably need to be introduced as a
>>> >      separate class, because of the performance expectations of
>>> >      existing implementations.
>>> >
>>> >     In the meantime, we have to maintain what we got,
>>> >     and that includes living with arrays and classes that wrap them.
>>> >
>>> >     Changing the spec is unlikely to succeed..
>>> >
>>> >     Martin
>>> >
>>>
>>
>>
>

From kevinb at google.com  Tue Mar 30 17:54:17 2010
From: kevinb at google.com (Kevin Bourrillion)
Date: Tue, 30 Mar 2010 10:54:17 -0700
Subject: java.util.Pair
In-Reply-To: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
Message-ID: <108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>

Pair is only a partial, flawed solution to a special case (n=2) of a very
significant problem: the disproportionate complexity of creating value types
in Java.  I support addressing the underlying problem in Java 8, and not
littering the API with dead-end solutions like Pair.


On Tue, Mar 30, 2010 at 1:08 AM, Weijun Wang <Weijun.Wang at sun.com> wrote:

> Hi All
>
> There are multiple CRs asking for a java.util.Pair class:
>
>   4983155
>   6229146
>   4947273
>
> I know such a simple thing can be made very complex and everyone might want
> to add a new method into it. How about we just make it most primitive?
> Simply an immutable and Serializable class, two final fields, one
> constructor, two getters (?), and no static factory methods. (S)he who does
> the real implementation has the privilege to choose between head/tail and
> car/cdr.
>
> Thanks
> Max
>
>


-- 
Kevin Bourrillion @ Google
internal:  http://goto/javalibraries
external: http://guava-libraries.googlecode.com

From scolebourne at joda.org  Tue Mar 30 20:39:00 2010
From: scolebourne at joda.org (Stephen Colebourne)
Date: Tue, 30 Mar 2010 16:39:00 -0400
Subject: java.util.Pair
In-Reply-To: <108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
Message-ID: <4b4f45e01003301339k5f110a74o2831de234b97d381@mail.gmail.com>

While I support Kevin's summary, having a public implementation of
Map.Entry in java.util would be very useful. (Along with making other
private classes public - unmodifiable iterator is one IIRC)
Stephen

On 30 March 2010 13:54, Kevin Bourrillion <kevinb at google.com> wrote:
> Pair is only a partial, flawed solution to a special case (n=2) of a very
> significant problem: the disproportionate complexity of creating value types
> in Java.  I support addressing the underlying problem in Java 8, and not
> littering the API with dead-end solutions like Pair.
>
>
> On Tue, Mar 30, 2010 at 1:08 AM, Weijun Wang <Weijun.Wang at sun.com> wrote:
>>
>> Hi All
>>
>> There are multiple CRs asking for a java.util.Pair class:
>>
>>   4983155
>>   6229146
>>   4947273
>>
>> I know such a simple thing can be made very complex and everyone might
>> want to add a new method into it. How about we just make it most primitive?
>> Simply an immutable and Serializable class, two final fields, one
>> constructor, two getters (?), and no static factory methods. (S)he who does
>> the real implementation has the privilege to choose between head/tail and
>> car/cdr.
>>
>> Thanks
>> Max
>>
>
>
>
> --
> Kevin Bourrillion @ Google
> internal:  http://goto/javalibraries
> external: http://guava-libraries.googlecode.com
>
>


From martinrb at google.com  Tue Mar 30 20:55:06 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 30 Mar 2010 13:55:06 -0700
Subject: java.util.Pair
In-Reply-To: <4b4f45e01003301339k5f110a74o2831de234b97d381@mail.gmail.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
	<4b4f45e01003301339k5f110a74o2831de234b97d381@mail.gmail.com>
Message-ID: <1ccfd1c11003301355v7a9975feoaf1aebe05fdf688b@mail.gmail.com>

On Tue, Mar 30, 2010 at 13:39, Stephen Colebourne <scolebourne at joda.org> wrote:
> While I support Kevin's summary, having a public implementation of
> Map.Entry in java.util would be very useful. (Along with making other
> private classes public - unmodifiable iterator is one IIRC)

./AbstractMap.java:569:    public static class SimpleEntry<K,V>
./AbstractMap.java:699:    public static class SimpleImmutableEntry<K,V>

---

Which unmodifiable iterator?


From scolebourne at joda.org  Tue Mar 30 21:01:57 2010
From: scolebourne at joda.org (Stephen Colebourne)
Date: Tue, 30 Mar 2010 17:01:57 -0400
Subject: java.util.Pair
In-Reply-To: <1ccfd1c11003301355v7a9975feoaf1aebe05fdf688b@mail.gmail.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
	<4b4f45e01003301339k5f110a74o2831de234b97d381@mail.gmail.com>
	<1ccfd1c11003301355v7a9975feoaf1aebe05fdf688b@mail.gmail.com>
Message-ID: <4b4f45e01003301401s1641c714qa6e4a66aab6a9b81@mail.gmail.com>

(I'm writing from a slow connection in a national park in Chile)

I meant a decorator for an iterator that wraps the original, making it
immutable.
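
A minimal sketch of such a decorator, assuming the only mutating
operation to block is remove(). The class name ReadOnlyIterator is
hypothetical and is not an existing java.util class.

```java
import java.util.Iterator;

// Hypothetical decorator: forwards hasNext()/next() to the wrapped
// iterator and rejects remove().
final class ReadOnlyIterator<E> implements Iterator<E> {
    private final Iterator<? extends E> delegate;

    ReadOnlyIterator(Iterator<? extends E> delegate) {
        if (delegate == null)
            throw new NullPointerException("delegate");
        this.delegate = delegate;
    }

    public boolean hasNext() { return delegate.hasNext(); }

    public E next() { return delegate.next(); }

    public void remove() {
        throw new UnsupportedOperationException("read-only iterator");
    }
}
```

Wrapping list.iterator() this way lets a caller walk the list without
being able to remove elements through the iterator.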

Stephen

On 30 March 2010 16:55, Martin Buchholz <martinrb at google.com> wrote:
> On Tue, Mar 30, 2010 at 13:39, Stephen Colebourne <scolebourne at joda.org> wrote:
>> While I support Kevin's summary, having a public implementation of
>> Map.Entry in java.util would be very useful. (Along with making other
>> private classes public - unmodifiable iterator is one IIRC)
>
> ./AbstractMap.java:569:    public static class SimpleEntry<K,V>
> ./AbstractMap.java:699:    public static class SimpleImmutableEntry<K,V>
>
> ---
>
> Which unmodifiable iterator?
>


From martinrb at google.com  Tue Mar 30 22:20:22 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 30 Mar 2010 15:20:22 -0700
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1704b7a21003300425i7dd1ef7he28728ad3cdb60e2@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
	<1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
	<1704b7a21003300425i7dd1ef7he28728ad3cdb60e2@mail.gmail.com>
Message-ID: <1ccfd1c11003301520g564876fehfce57def62f6d6b3@mail.gmail.com>

On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> Hi Martin,
>
> Thanks much for your feedback.  The first approach that comes to mind to
> implement O(1) time front as well as rear insertion is to create a cyclic
> list structure with a front/rear pointer - to insert at the front requires
> decrementing the front pointer (modulo the size) and to insert at the rear
> requires incrementing the rear pointer (modulo the size).  We need to resize
> when the two pointers bump into each other.  Could you explain more about
> your suggestion of introducing an arraylet that is shared by the front and
> the rear?

It was a half-baked idea - I don't know if there's a way to turn it into
something useful.  I was thinking of the ArrayDeque implementation,
where all the elements live in a single array.
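
For comparison, the cyclic front/rear scheme quoted above can be
sketched over a single array like this. It is a simplified illustration
in the spirit of ArrayDeque, not its actual source; the class name is
hypothetical, and the rear position is derived from front + size
instead of being stored as a second pointer.

```java
// Hypothetical sketch of a cyclic deque in one backing array.
final class RingDeque<E> {
    private Object[] items = new Object[8];  // capacity stays a power of two
    private int front;                       // index of the first element
    private int size;

    void addFirst(E e) {
        if (size == items.length) grow();
        front = (front - 1) & (items.length - 1);        // decrement modulo capacity
        items[front] = e;
        size++;
    }

    void addLast(E e) {
        if (size == items.length) grow();
        items[(front + size) & (items.length - 1)] = e;  // rear slot, modulo capacity
        size++;
    }

    @SuppressWarnings("unchecked")
    E get(int i) {
        if (i < 0 || i >= size) throw new IndexOutOfBoundsException();
        return (E) items[(front + i) & (items.length - 1)];
    }

    int size() { return size; }

    // Called when the two ends meet: copy into a larger array, unwrapped.
    private void grow() {
        Object[] bigger = new Object[items.length * 2];
        for (int i = 0; i < size; i++)
            bigger[i] = items[(front + i) & (items.length - 1)];
        items = bigger;
        front = 0;
    }
}
```

Insertion at either end is amortized O(1); the cost is that a resize
copies every element, which is what the chunked designs try to avoid.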

>  It's not clear to me how that would help and/or be a better
> approach than the cyclic list.  Anyhow, the paper that you reference,
> "Resizable arrays in optimal time and space", gives a deque so if we take
> that approach then the deque is specified.

Technically, ArrayList also supports the Deque operations -
just not efficiently.


From ben_manes at yahoo.com  Tue Mar 30 22:45:34 2010
From: ben_manes at yahoo.com (Ben Manes)
Date: Tue, 30 Mar 2010 15:45:34 -0700 (PDT)
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1704b7a21003300425i7dd1ef7he28728ad3cdb60e2@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
	<1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
	<1704b7a21003300425i7dd1ef7he28728ad3cdb60e2@mail.gmail.com>
Message-ID: <575486.92749.qm@web38807.mail.mud.yahoo.com>

You might be able to take some ideas from the VList data structure and zipper model. The research on persistent data structures tends to have some fairly interesting ideas, and some approaches might work well here.


________________________________
From: Kevin L. Stern <kevin.l.stern at gmail.com>
To: Martin Buchholz <martinrb at google.com>
Cc: core-libs-dev at openjdk.java.net
Sent: Tue, March 30, 2010 4:25:41 AM
Subject: Re: A List implementation backed by multiple small arrays rather than the traditional single large array.

Hi Martin,

Thanks much for your feedback.  The first approach 
that comes to mind to implement O(1) time front as well as rear 
insertion is to create a cyclic list structure with a 
front/rear pointer - to insert at the front requires decrementing the 
front pointer (modulo the size) and to insert at the rear requires 
incrementing the rear pointer (modulo the size).  We need to resize when the two pointers bump into each other.  Could you explain more about 
your suggestion of introducing an arraylet that is shared by the front 
and the rear?  It's not clear to me how that would help and/or be a better approach than the cyclic list.  Anyhow, the 
paper that you reference, "Resizable arrays in optimal time and space", 
gives a deque so if we take that approach then the deque is specified.

Shrinking the array is not a problem - this comes 'for free' (in the sense that it's required) in the optimal space data structure that you reference.

Regarding the gap array suggestion, it is not clear to me how we will still 
compute the correct arraylet/offset for an index in O(1) time if we have arraylets of arbitrary size.  Even worse, if we go with the optimal 
space data structure we will not have the option of creating arraylets 
of arbitrary size or with arbitrary gaps between elements.
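
To make the concern concrete: if every arraylet has the same power-of-two size, the arraylet/offset lookup is just a shift and a mask, and that is exactly what arbitrary-size arraylets or gaps would break. The sketch below assumes 1024 elements per arraylet; the class name ChunkIndex and the constants are hypothetical and are not taken from ChunkedArrayList or from the paper.

```java
/** Hypothetical helper: maps a list index to (arraylet, offset) for fixed-size arraylets. */
final class ChunkIndex {
    static final int SHIFT = 10;              // assumed: 1 << 10 = 1024 elements per arraylet
    static final int MASK = (1 << SHIFT) - 1; // low bits select the slot inside an arraylet

    private ChunkIndex() {}

    /** Index of the arraylet that holds the element. */
    static int arraylet(int index) { return index >>> SHIFT; }

    /** Position of the element inside its arraylet. */
    static int offset(int index) { return index & MASK; }
}
```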

You are absolutely right about the n^(1/2) space overhead; I was not aware of this research.  I'll go ahead and implement the structure 
defined in "Resizable arrays in optimal time and space" (once I find some time to do so).

Regards,

Kevin


On Mon, Mar 29, 2010 at 2:23 AM, Martin Buchholz <martinrb at google.com> wrote:

On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>> I put together the following class, ChunkedArrayList, in response to
>>> Martin's request (excerpted from an earlier conversation on this web board)
>>> below.
>>>
>>> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
>>
>>
>>> Thoughts?
>
>This class is well on the way to what I was thinking of,
>>but my bar for acceptance is a little higher.
>>In particular, I don't want to add yet another class
>>that can replace some, but not all, of the existing
>>list implementations.
>
>>Most obviously, I don't want to lose the ability,
>>introduced in ArrayDeque, of having O(1) insertion
>>at the front and end of the collection.
>>Perhaps you can do this by having one "arraylet"
>>always be shared by both ends, which
>>grow towards each other in circular fashion.
>
>>I also think we should shrink the array when
>>necessary, so that occupancy never drops
>>below, say 50%.
>
>>Perhaps we should also have amortized O(1)
>>insertion in the middle by using a "gap array".
>>Probably more important for byte/char collections
>>like StringBuilder...
>
>>I believe there are more complicated implementations
>>that permit O(1) insertions at the ends, and only
>>O(sqrt(N)) space overhead.
>
>>....
>
>>E.g. Use your favorite search engine to do
>>some research on:
>>Resizable arrays in optimal time and space
>>Succinct dynamic data structures
>
>>Meta-comment: there is not enough transfer of
>>academic research results into practice; I would think this
>>is one of the responsibilities of the researchers.
>
>>I presume you'd be willing to sign a
>>contributor agreement to get your changes into
>>the JDK someday.
>
>>Martin
>
>
>>> Regards,
>>>
>>> Kevin
>>>
>>>
>>> On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
>>> wrote:
>>>
>>>     It surely is not a good idea to use a single backing array
>>>     for huge arrays.  As you point out, it's up to 32GB
>>>     for just one object.  But the core JDK
>>>     doesn't offer a suitable alternative for users who need very
>>>     large collections.
>>>
>>>     It would have been more in the spirit of Java to have a
>>>     collection class instead of ArrayList that was not fastest at
>>>     any particular operation, but had excellent asymptotic behaviour,
>>>     based on backing arrays containing backing arrays.
>>>     But:
>>>     - no such excellent class has been written yet
>>>      (or please point me to such a class)
>>>     - even if it were, such a best-of-breed-general-purpose
>>>      List implementation would probably need to be introduced as a
>>>      separate class, because of the performance expectations of
>>>      existing implementations.
>>>
>>>     In the meantime, we have to maintain what we got,
>>>     and that includes living with arrays and classes that wrap them.
>>>
>>>     Changing the spec is unlikely to succeed..
>>>
>>>     Martin
>>>
>



From jason_mehrens at hotmail.com  Tue Mar 30 23:11:58 2010
From: jason_mehrens at hotmail.com (Jason Mehrens)
Date: Tue, 30 Mar 2010 18:11:58 -0500
Subject: java.util.Pair
In-Reply-To: <4b4f45e01003301401s1641c714qa6e4a66aab6a9b81@mail.gmail.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>,
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>,
	<4b4f45e01003301339k5f110a74o2831de234b97d381@mail.gmail.com>,
	<1ccfd1c11003301355v7a9975feoaf1aebe05fdf688b@mail.gmail.com>,
	<4b4f45e01003301401s1641c714qa6e4a66aab6a9b81@mail.gmail.com>
Message-ID: <SNT114-W82F3353E133F3BAA6A0B9831F0@phx.gbl>


Stephen,

 
I'm all for adding support for unmodifiableIterable, unmodifiableNavigableMap, and unmodifiableNavigableSet.  However, I think adding public access to such an iterator decorator goes against the guidelines of the collections design FAQ (4 and 5):

http://java.sun.com/javase/6/docs/technotes/guides/collections/designfaq.html#8

 
So following the guideline of never passing an Iterator around, that leaves you with the following:

1. If your custom container is backed by a collection, use Collections.unmodifiableXXX(this.internal).iterator().

2. If your custom container is backed by an array, use Arrays.asList, followed by point 1.

3. If your custom container has a specialized layout, you have to write an iterator with a remove implementation that either removes or throws, so the unmodifiable one is easy to write.
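
For concreteness, the kind of decorator under discussion might look roughly like the sketch below. The class name ReadOnly and its two method names are hypothetical and stand in for whatever the JDK would call them; nothing here is an existing JDK API.

```java
import java.util.Iterator;

/** Hypothetical read-only decorators, sketched for illustration only. */
final class ReadOnly {
    private ReadOnly() {}

    /** Wraps an iterator so that remove() always throws. */
    static <T> Iterator<T> iterator(final Iterator<? extends T> it) {
        return new Iterator<T>() {
            public boolean hasNext() { return it.hasNext(); }
            public T next() { return it.next(); }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    /** Wraps an Iterable so that every iterator it hands out is read-only. */
    static <T> Iterable<T> iterable(final Iterable<? extends T> source) {
        return new Iterable<T>() {
            public Iterator<T> iterator() {
                return ReadOnly.<T>iterator(source.iterator());
            }
        };
    }
}
```

Each call to iterator() allocates one small wrapper object, which is the "extra method calls and garbage" cost raised in this message.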

 
Assuming that the JDK had an unmodifiableIterable decorator, is there a corner case that I'm not seeing, or is the main reservation the extra method calls and the creation of some well-behaved garbage?

 
Jason
 
> Date: Tue, 30 Mar 2010 17:01:57 -0400
> Subject: Re: java.util.Pair
> From: scolebourne at joda.org
> To: core-libs-dev at openjdk.java.net
> 
> (I'm writing from a slow connection in a national park in Chile)
> 
> I meant a decorator for an iterator that wraps the original, making it
> immutable.
> 
> Stephen
 		 	   		  

From joe.darcy at Oracle.com  Tue Mar 30 23:34:52 2010
From: joe.darcy at Oracle.com (joe.darcy at Oracle.com)
Date: Tue, 30 Mar 2010 16:34:52 -0700
Subject: java.util.Pair
In-Reply-To: <108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
Message-ID: <4BB28A9C.10705@oracle.com>


On 3/30/2010 10:54 AM, Kevin Bourrillion wrote:
> Pair is only a partial, flawed solution to a special case (n=2) of a 
> very significant problem: the disproportionate complexity of creating 
> value types in Java.  I support addressing the underlying problem in 
> Java 8, and not littering the API with dead-end solutions like Pair.

While I have sympathy with that conclusion, there is the
side-effect of littering many APIs with the flotsam of lots of different
classes named "Pair."  My inclination would be to produce one adequate
Pair class in the JDK to prevent the proliferation of yet more Pair 
classes in other code bases.

I should know better than to take the bait, below is a first cut at
java.util.Pair.

-Joe

package java.util;

import java.util.Objects;

/**
  * An immutable pair of values.  The values may be null.  The values
  * themselves may be mutable.
  *
  * @param <A> the type of the first element of the pair
  * @param <B> the type of the second element of the pair
  *
  * @since 1.7
  */
public final class Pair<A, B> {
     private final A a;
     private final B b;

     private Pair(A a, B b) {
	this.a = a;
	this.b = b;
     }

     /**
      * Returns a pair whose elements are the first and second
      * arguments, respectively.
      * @return a pair constructed from the arguments
      */
     public static <C, D> Pair<C, D> valueOf(C c, D d) {
	// Don't mandate new values.
	return new Pair<C, D>(c, d);
     }

     /**
      * Returns the value of the first element of the pair.
      * @return the value of the first element of the pair
      */
     public A getA() {
	return a;
     }

     /**
      * Returns the value of the second element of the pair.
      * @return the value of the second element of the pair
      */
     public B getB() {
	return b;
     }

     /**
      * TBD
      */
     @Override
     public String toString() {
	return "[" + Objects.toString(a) + ", " + Objects.toString(b) + "]";
     }

     /**
      * TBD
      */
     @Override
     public boolean equals(Object x) {
	if (!(x instanceof Pair))
	    return false;
	else {
	    Pair<?,?> that = (Pair<?,?>) x;
	    return
		Objects.equals(this.a, that.a) &&
		Objects.equals(this.b, that.b);
	}
     }

     /**
      * TBD
      */
     @Override
     public int hashCode() {
	return Objects.hash(a, b);
     }
}


From xueming.shen at sun.com  Wed Mar 31 02:15:11 2010
From: xueming.shen at sun.com (xueming.shen at sun.com)
Date: Wed, 31 Mar 2010 02:15:11 +0000
Subject: hg: jdk7/tl/jdk: 6902790: Converting/displaying HKSCs characters
	issue on Vista and Windows7; ...
Message-ID: <20100331021524.C3C5044ABB@hg.openjdk.java.net>

Changeset: 3771ac2a8b3b
Author:    sherman
Date:      2010-03-30 19:10 -0700
URL:       http://hg.openjdk.java.net/jdk7/tl/jdk/rev/3771ac2a8b3b

6902790: Converting/displaying HKSCs characters issue on Vista and Windows7
6911753: NSN wants to add Big5 HKSCS-2004 support
Summary: support HKSCS2008 in Big5_HKSCS and MS950_HKSCS
Reviewed-by: okutsu

! make/sun/nio/cs/FILES_java.gmk
! make/sun/nio/cs/Makefile
+ make/tools/CharsetMapping/Big5.c2b
+ make/tools/CharsetMapping/Big5.map
+ make/tools/CharsetMapping/Big5.nr
+ make/tools/CharsetMapping/HKSCS2001.c2b
+ make/tools/CharsetMapping/HKSCS2001.map
+ make/tools/CharsetMapping/HKSCS2008.c2b
+ make/tools/CharsetMapping/HKSCS2008.map
+ make/tools/CharsetMapping/HKSCS_XP.c2b
+ make/tools/CharsetMapping/HKSCS_XP.map
! make/tools/CharsetMapping/dbcs
- make/tools/src/build/tools/charsetmapping/CharsetMapping.java
+ make/tools/src/build/tools/charsetmapping/DBCS.java
+ make/tools/src/build/tools/charsetmapping/EUC_TW.java
- make/tools/src/build/tools/charsetmapping/GenerateDBCS.java
- make/tools/src/build/tools/charsetmapping/GenerateEUC_TW.java
- make/tools/src/build/tools/charsetmapping/GenerateMapping.java
- make/tools/src/build/tools/charsetmapping/GenerateSBCS.java
+ make/tools/src/build/tools/charsetmapping/HKSCS.java
+ make/tools/src/build/tools/charsetmapping/JIS0213.java
! make/tools/src/build/tools/charsetmapping/Main.java
+ make/tools/src/build/tools/charsetmapping/SBCS.java
+ make/tools/src/build/tools/charsetmapping/Utils.java
! src/share/classes/sun/awt/HKSCS.java
! src/share/classes/sun/io/ByteToCharBig5.java
! src/share/classes/sun/io/ByteToCharBig5_HKSCS.java
! src/share/classes/sun/io/ByteToCharBig5_Solaris.java
- src/share/classes/sun/io/ByteToCharHKSCS.java
- src/share/classes/sun/io/ByteToCharHKSCS_2001.java
! src/share/classes/sun/io/ByteToCharMS950_HKSCS.java
! src/share/classes/sun/io/CharToByteBig5.java
! src/share/classes/sun/io/CharToByteBig5_HKSCS.java
! src/share/classes/sun/io/CharToByteBig5_Solaris.java
- src/share/classes/sun/io/CharToByteHKSCS.java
- src/share/classes/sun/io/CharToByteHKSCS_2001.java
! src/share/classes/sun/io/CharToByteMS950_HKSCS.java
- src/share/classes/sun/nio/cs/ext/Big5.java
! src/share/classes/sun/nio/cs/ext/Big5_HKSCS.java
+ src/share/classes/sun/nio/cs/ext/Big5_HKSCS_2001.java
! src/share/classes/sun/nio/cs/ext/Big5_Solaris.java
! src/share/classes/sun/nio/cs/ext/ExtendedCharsets.java
! src/share/classes/sun/nio/cs/ext/HKSCS.java
- src/share/classes/sun/nio/cs/ext/HKSCS_2001.java
! src/share/classes/sun/nio/cs/ext/MS950_HKSCS.java
+ src/share/classes/sun/nio/cs/ext/MS950_HKSCS_XP.java
! src/solaris/classes/sun/awt/fontconfigs/solaris.fontconfig.properties
! src/solaris/native/java/lang/java_props_md.c
! src/windows/classes/sun/awt/windows/fontconfig.properties
! src/windows/native/java/lang/java_props_md.c
! test/java/nio/charset/Charset/NIOCharsetAvailabilityTest.java
! test/java/nio/charset/Charset/RegisteredCharsets.java


From martinrb at google.com  Wed Mar 31 04:41:37 2010
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 30 Mar 2010 21:41:37 -0700
Subject: Refactor String's exception handling
In-Reply-To: <4BB1E4AC.9000307@gmx.de>
References: <4A95079A.8080803@gmx.de>
	<1ccfd1c11003191713w7178db28u161fd7c42127a775@mail.gmail.com>
	<4BA543A0.2060600@gmx.de>
	<1ccfd1c11003210056r13140d02kedc569722567ea2e@mail.gmail.com>
	<1ccfd1c11003211239h2105e5f1m903dd5d3fbf5387b@mail.gmail.com>
	<4BA78CE8.9020107@gmx.de>
	<1ccfd1c11003221527q29f61f7u700344a99d293ceb@mail.gmail.com>
	<4BAE73B7.40101@gmx.de>
	<1ccfd1c11003291517l4a46f260s5c78639244da6420@mail.gmail.com>
	<4BB1E4AC.9000307@gmx.de>
Message-ID: <1ccfd1c11003302141v206eff50wd4497fa89c93539d@mail.gmail.com>

On Tue, Mar 30, 2010 at 04:46, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 30.03.2010 00:17, Martin Buchholz wrote:
>
> Hi Ulf,
>
> I will sponsor your initiative to refactor the exception handling.
>
> Before this can go in, we should have just the exception handling
> changes contained in one patch, since it is such a big change.
>
>
> You mean that I "surreptitiously" included some beautification, even in
> the first patch?
> Yes, often I can't resist; hit me. Example:
> It seems that someone had previously tried to standardize the this-triple in
> String's constructors. Looking closer, you can see that they differ
> slightly, so to my taste it looked best to order them in the member
> variables' order, with the real value first.

Yes, it's a tough question as to how finely to split changes.
The overhead of creating separate changes (must have a bug ID)
is unfortunately higher than we'd like.

That said, there are big advantages of separating out
purely cosmetic large changes.  E.g. we can verify
that the generated bytecode is identical.
This becomes much more important for pervasive
mechanical changes, like changing @exception => @throws.

> On the other hand, I think it's too much overhead to manage separate bugs
> for such beautifications.
> What do you think is a reasonable threshold for such on-the-fly
> beautifications?

> You seem to like how I merged the different variations into one central
> standard behaviour. Is that valid for AbstractStringBuilder too?
> I think it best matches to current behavior.
> Exception message refers to ...
> 1. <begin>, if begin itself is invalid referring to 0 and srcLen
> 2. <end>, if end itself is invalid referring to 0 and srcLen
> 3. <end-begin>, if end is invalid in combination with given begin
> Alternative:
> 2+3. <end>, if end is invalid referring to 0 and srcLen or in combination
> with given begin

Better detail messages are slightly incompatible,
but helpful for most users.  Should we switch?
It depends on how much we value compatibility.
Probably the JDK culture is still too conservative.
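
As an illustration of the three-case scheme listed in the quoted text, a range check along those lines might look like the sketch below. The class and method names are hypothetical and this is not the code from the patch under review; it only shows which value each case reports.

```java
/** Hypothetical sketch of the three detail-message cases; not the patch itself. */
final class RangeChecks {
    private RangeChecks() {}

    static void checkRange(int begin, int end, int srcLen) {
        if (begin < 0 || begin > srcLen)   // case 1: begin is invalid on its own
            throw new StringIndexOutOfBoundsException(begin);
        if (end < 0 || end > srcLen)       // case 2: end is invalid on its own
            throw new StringIndexOutOfBoundsException(end);
        if (end < begin)                   // case 3: end is invalid relative to begin
            throw new StringIndexOutOfBoundsException(end - begin);
    }
}
```

The "2+3" alternative would collapse the last two branches into one that always reports end.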

> ----
> Run at least the following tests
> (below is how I test this code myself)
>
> /home/martinrb/jct-tools/3.2.2_03/linux/bin/jtreg -v:nopass,fail
> -vmoption:-enablesystemassertions -automatic "-k:\!ignore"
> -testjdk:/usr/local/google/home/martin/ws/upstream/build/linux-amd64
> test/sun/nio/cs test/java/nio/charset test/java/lang/StringCoding
> test/java/lang/StringBuilder test/java/lang/StringBuffer
> test/java/lang/String test/java/lang/Appendable
>
>
> Unfortunately I still haven't managed to even partly build a patched JDK on
> my Windows notebook.

It's fine to run javac + use -Xbootclasspath.

> - Cygwin crashes under heavy workloads, e.g. webrev on more than ~20 files.
> - There is very little support on the <nb-projects-dev at openjdk.java.net> mailing list.
> - I'm surprised that there is so little collaboration between NetBeans and JDK
> developers in the same software company.
>
> So as a workaround, I'm fine with running my patches via -Xbootclasspath in
> the NetBeans IDE.
> But I don't know how to run the jtreg tests.

You should really learn how to run JDK tests.
JDK development on Linux is easier than development on Windows,
but it certainly should be possible on Windows.
Recent versions of jtreg are probably easier to run on Windows.
http://openjdk.java.net/jtreg/index.html

> I had written my tests exclusively using JUnit, because there is beautiful
> support for it in NetBeans.
> I remember there was an email from Mark Reinhold some months ago saying that JUnit
> tests are now supported by jtreg too.

I would be interested in that as well.

> Maybe you have some suggestions to me.
>
> ----
>
> I think returning len below is too confusing.
> Just make the return type void.
>
> +    int checkPositionIndex(int index) {
> +        int len = count; // not sure, if JIT recognizes that it's final ?
> +        checkPositionIndex(len, index);
> +        return len;
> +    }
>
>
> Returning len avoids slowly loading the member variable into a local
> register/variable twice.
> From the performance side I think we only have two choices: using the return
> trick or dropping those convenience methods altogether.
> The latter would be faster for the interpreter and/or the non-inlined case.

In core libraries we often make engineering decisions to use
trickier or more verbose code for the sake of performance,
but I think this is going over the line.
I think you can rely on inlining of such small, always called,
methods like checkPositionIndex.

> ----
>
> We will need a significant merge once I commit
> related changes.
>
>
> Maybe we could announce this on this list, so others could decide whether to
> hurry to commit their changes before, or to do their own merge later.

I run into lots of merge conflicts, but always with my own
changes!  I don't think we have a lot of contention.

Martin


From forax at univ-mlv.fr  Wed Mar 31 07:31:22 2010
From: forax at univ-mlv.fr (=?UTF-8?B?UsOpbWkgRm9yYXg=?=)
Date: Wed, 31 Mar 2010 09:31:22 +0200
Subject: java.util.Pair
In-Reply-To: <4BB28A9C.10705@oracle.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
	<4BB28A9C.10705@oracle.com>
Message-ID: <4BB2FA4A.6090809@univ-mlv.fr>

On 31/03/2010 01:34, joe.darcy at Oracle.com wrote:
>
>
> On 3/30/2010 10:54 AM, Kevin Bourrillion wrote:
>> Pair is only a partial, flawed solution to a special case (n=2) of a 
>> very significant problem: the disproportionate complexity of creating 
>> value types in Java.  I support addressing the underlying problem in 
>> Java 8, and not littering the API with dead-end solutions like Pair.
>
> While I have sympathy with that conclusion, there is the
> side-effect of littering many APIs with the flotsam of lots of different
> classes named "Pair."  My inclination would be to produce one adequate
> Pair class in the JDK to prevent the proliferation of yet more Pair 
> classes in other code bases.
>
> I should know better than to take the bait, below is a first cut at
> java.util.Pair.

In equals, instanceof Pair should be instanceof Pair<?,?>.
Pair is a raw type.

getA()/getB() should be renamed to getFirst()/getSecond(),
according to their javadoc.

Objects.toString() is not necessary in Pair.toString() because
StringBuilder.append (in fact String.valueOf()) already
returns "null" for null.
And as a minor optimisation, ']' can be used instead of "]".

public String toString() {
     return "[" + a + ", " + b + ']';
}
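
Putting the suggestions above together, the first-cut class might end up looking like the sketch below. It is illustrative only: the field names first/second are an assumption, the class is declared outside java.util and package-private so that it compiles on its own, and the javadoc is omitted.

```java
import java.util.Objects;

/** Sketch of the first-cut Pair with the review suggestions applied. */
final class Pair<A, B> {
    private final A first;
    private final B second;

    private Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    static <C, D> Pair<C, D> valueOf(C c, D d) {
        return new Pair<C, D>(c, d);
    }

    A getFirst() { return first; }

    B getSecond() { return second; }

    @Override
    public String toString() {
        // String concatenation already renders a null reference as "null".
        return "[" + first + ", " + second + ']';
    }

    @Override
    public boolean equals(Object x) {
        if (!(x instanceof Pair<?, ?>))
            return false;
        Pair<?, ?> that = (Pair<?, ?>) x;
        return Objects.equals(first, that.first)
            && Objects.equals(second, that.second);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }
}
```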

>
> -Joe

Rémi


From Weijun.Wang at Sun.COM  Wed Mar 31 07:32:12 2010
From: Weijun.Wang at Sun.COM (Weijun Wang)
Date: Wed, 31 Mar 2010 15:32:12 +0800
Subject: java.util.Pair
In-Reply-To: <4BB28A9C.10705@oracle.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
	<4BB28A9C.10705@oracle.com>
Message-ID: <1CA21FA1-0EE9-4973-A0B7-E02C1E8F9861@Sun.COM>

"implements Serializable"?

-Max

On Mar 31, 2010, at 7:34 AM, joe.darcy at Oracle.com wrote:

> 
> 
> On 3/30/2010 10:54 AM, Kevin Bourrillion wrote:
>> Pair is only a partial, flawed solution to a special case (n=2) of a very significant problem: the disproportionate complexity of creating value types in Java.  I support addressing the underlying problem in Java 8, and not littering the API with dead-end solutions like Pair.
> 
> While I have sympathy with that conclusion, there is the
> side-effect of littering many APIs with the flotsam of lots of different
> classes named "Pair."  My inclination would be to produce one adequate
> Pair class in the JDK to prevent the proliferation of yet more Pair classes in other code bases.
> 
> I should know better than to take the bait, below is a first cut at
> java.util.Pair.
> 
> -Joe
> 
> package java.util;
> 
> import java.util.Objects;
> 
> /**
> * An immutable pair of values.  The values may be null.  The values
> * themselves may be mutable.
> *
> * @param <A> the type of the first element of the pair
> * @param <B> the type of the second element of the pair
> *
> * @since 1.7
> */
> public final class Pair<A, B> {
>    private final A a;
>    private final B b;
> 
>    private Pair(A a, B b) {
> 	this.a = a;
> 	this.b = b;
>    }
> 
>    /**
>     * Returns a pair whose elements are the first and second
>     * arguments, respectively.
>     * @return a pair constructed from the arguments
>     */
>    public static <C, D> Pair<C, D> valueOf(C c, D d) {
> 	// Don't mandate new values.
> 	return new Pair<C, D>(c, d);
>    }
> 
>    /**
>     * Returns the value of the first element of the pair.
>     * @return the value of the first element of the pair
>     */
>    public A getA() {
> 	return a;
>    }
> 
>    /**
>     * Returns the value of the second element of the pair.
>     * @return the value of the second element of the pair
>     */
>    public B getB() {
> 	return b;
>    }
> 
>    /**
>     * TBD
>     */
>    @Override
>    public String toString() {
> 	return "[" + Objects.toString(a) + ", " + Objects.toString(b) + "]";
>    }
> 
>    /**
>     * TBD
>     */
>    @Override
>    public boolean equals(Object x) {
> 	if (!(x instanceof Pair))
> 	    return false;
> 	else {
> 	    Pair<?,?> that = (Pair<?,?>) x;
> 	    return
> 		Objects.equals(this.a, that.a) &&
> 		Objects.equals(this.b, that.b);
> 	}
>    }
> 
>    /**
>     * TBD
>     */
>    @Override
>    public int hashCode() {
> 	return Objects.hash(a, b);
>    }
> }


From kevinb at google.com  Wed Mar 31 15:34:58 2010
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 31 Mar 2010 08:34:58 -0700
Subject: java.util.Pair
In-Reply-To: <4BB2FA4A.6090809@univ-mlv.fr>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
	<4BB28A9C.10705@oracle.com> <4BB2FA4A.6090809@univ-mlv.fr>
Message-ID: <z2r108fcdeb1003310834lee0d1795v29030356de52d4bd@mail.gmail.com>

On Wed, Mar 31, 2010 at 12:31 AM, Rémi Forax <forax at univ-mlv.fr> wrote:

> In equals, instanceof Pair should be instanceof Pair<?,?>.
> Pair is a raw type.
>

Tangent: there are those of us who believe javac is quite mistaken to issue
a warning on 'instanceof Pair'.  (And even if it were right in theory (which
I don't think it is), weren't warnings supposed to be things that would warn
you about possible *bugs*?)


-- 
Kevin Bourrillion @ Google
internal:  http://goto/javalibraries
external: http://guava-libraries.googlecode.com

From crazybob at crazybob.org  Wed Mar 31 15:36:26 2010
From: crazybob at crazybob.org (Bob Lee)
Date: Wed, 31 Mar 2010 08:36:26 -0700
Subject: java.util.Pair
In-Reply-To: <4BB28A9C.10705@oracle.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
	<4BB28A9C.10705@oracle.com>
Message-ID: <s2qa74683f91003310836z63022a98hce38aefe9c96b166@mail.gmail.com>

On Tue, Mar 30, 2010 at 4:34 PM, <joe.darcy at oracle.com> wrote:

> While I have sympathy with that conclusion, there is the
> side-effect of littering many APIs with the flotsam of lots of different
> classes named "Pair."  My inclination would be to produce one adequate
> Pair class in the JDK to prevent the proliferation of yet more Pair classes
> in other code bases.


Please don't add Pair. It should never be used in APIs. Adding it to
java.util will enable and even encourage its use in APIs. The damage done to
future Java APIs will be far worse than a few duplicate copies of Pair (I
don't even see that many). I think we'll have a hard time finding use cases
to back up this addition.

Bob

From kevinb at google.com  Wed Mar 31 16:14:59 2010
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 31 Mar 2010 09:14:59 -0700
Subject: java.util.Pair
In-Reply-To: <s2qa74683f91003310836z63022a98hce38aefe9c96b166@mail.gmail.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
	<4BB28A9C.10705@oracle.com>
	<s2qa74683f91003310836z63022a98hce38aefe9c96b166@mail.gmail.com>
Message-ID: <l2w108fcdeb1003310914za384c74euf7a544698fc16e88@mail.gmail.com>

On Wed, Mar 31, 2010 at 8:36 AM, Bob Lee <crazybob at crazybob.org> wrote:

> Please don't add Pair. It should never be used in APIs. Adding it to
> java.util will enable and even encourage its use in APIs. The damage done to
> future Java APIs will be far worse than a few duplicate copies of Pair (I
> don't even see that many). I think we'll have a hard time finding use cases
> to back up this addition.
>

FYI, here are some examples of types you can look forward to seeing in Java
code near you when you have a Pair class available:

 Pair<List<String>,List<Pair<String,List<Boolean>>>>

 Map<Double,List<Pair<QueryTuple,Map<StatType,Number>>>>

 Map<Locale,Map<String,Pair<DisplayTimeScheme,Pair<String,String>>>>

 FJ.EmitFn<Pair<Long, List<Pair<String, List<Pair<Integer, Integer>>>>>>

 Processor<Pair<List<DiffItem<T>>,Pair<List<T>,List<T>>>,List<DiffItem<T>>>

 DoFn<Pair<String,Collection<Pair<String,Pair<Double,String>>>>,Pair<String,List<Pair<String,Pair<Double,String>>>>>

These are all real examples found in real, live production code (simplified
a little).  There were only a scant few examples of this... caliber... that
did not involve Pair.

The problem is that classes like Pair simply go that much further to indulge
the desire to never have to create any actual types of our own.  When we're
forced to create our own types, we begin to model our data more
appropriately, which I believe leads us to create good abstractions at
broader levels of granularity as well.
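
As a rough illustration of the point about named types: something like Pair<String,Pair<Double,String>> could be replaced by a small class whose field names say what the data means. Everything in the sketch below is hypothetical, including the class name ScoredTerm and the meaning assigned to each field; the production code quoted above is not available here.

```java
import java.util.Objects;

/** Hypothetical named value type; the field meanings are invented for illustration. */
final class ScoredTerm {
    private final String term;
    private final double score;
    private final String source;

    ScoredTerm(String term, double score, String source) {
        this.term = Objects.requireNonNull(term);
        this.score = score;
        this.source = Objects.requireNonNull(source);
    }

    String term() { return term; }

    double score() { return score; }

    String source() { return source; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ScoredTerm)) return false;
        ScoredTerm that = (ScoredTerm) o;
        return term.equals(that.term)
            && Double.compare(score, that.score) == 0
            && source.equals(that.source);
    }

    @Override
    public int hashCode() {
        return Objects.hash(term, score, source);
    }

    @Override
    public String toString() {
        return "ScoredTerm[" + term + ", " + score + ", " + source + "]";
    }
}
```

The boilerplate is exactly the "disproportionate complexity of creating value types" mentioned earlier in the thread, but a reader of List<ScoredTerm> no longer has to guess what getA().getB() means.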


-- 
Kevin Bourrillion @ Google
internal:  http://goto/javalibraries
external: http://guava-libraries.googlecode.com

From mr at sun.com  Wed Mar 31 16:25:35 2010
From: mr at sun.com (Mark Reinhold)
Date: Wed, 31 Mar 2010 09:25:35 -0700
Subject: java.util.Pair 
In-Reply-To: kevinb@google.com; Wed, 31 Mar 2010 09:14:59 PDT;
	<l2w108fcdeb1003310914za384c74euf7a544698fc16e88@mail.gmail.com> 
Message-ID: <20100331162535.88F56420@eggemoggin.niobe.net>

> Date: Wed, 31 Mar 2010 09:14:59 -0700
> From: Kevin Bourrillion <kevinb at google.com>

> ...
> 
> The problem is that classes like Pair simply go that much further to indulge
> the desire to never have to create any actual types of our own.  When we're
> forced to create our own types, we begin to model our data more appropriately,
> which I believe leads us to create good abstractions at broader levels of
> granularity as well.

I agree.  Java isn't Lisp.

- Mark


From jjb at google.com  Wed Mar 31 16:40:55 2010
From: jjb at google.com (Joshua Bloch)
Date: Wed, 31 Mar 2010 09:40:55 -0700
Subject: java.util.Pair
In-Reply-To: <20100331162535.88F56420@eggemoggin.niobe.net>
References: <l2w108fcdeb1003310914za384c74euf7a544698fc16e88@mail.gmail.com>
	<20100331162535.88F56420@eggemoggin.niobe.net>
Message-ID: <p2g17b2302a1003310940w3e18f6a3s5fdb5b5639368d09@mail.gmail.com>

Just to add my voice to the chorus, I think adding Pair is seductive but
ill-considered.  Based on our experience at Google, I believe it makes a bad
situation worse.  I do believe that Kevin's idea is worthy of exploration: in
essence trying to encapsulate all of the knowledge in Chapter 3 of Effective
Java into the language, so that creating a fully-functional value type is as
simple as naming its fields and providing their types.  Of course the devil
is in the details, but this could be a very good thing.

           Josh

On Wed, Mar 31, 2010 at 9:25 AM, Mark Reinhold <mr at sun.com> wrote:

> > Date: Wed, 31 Mar 2010 09:14:59 -0700
> > From: Kevin Bourrillion <kevinb at google.com>
>
> > ...
> >
> > The problem is that classes like Pair simply go that much further to
> indulge
> > the desire to never have to create any actual types of our own.  When
> we're
> > forced to create our own types, we begin to model our data more
> appropriately,
> > which I believe leads us to create good abstractions at broader levels of
> > granularity as well.
>
> I agree.  Java isn't Lisp.
>
> - Mark
>

From forax at univ-mlv.fr  Wed Mar 31 17:04:23 2010
From: forax at univ-mlv.fr (=?UTF-8?B?UsOpbWkgRm9yYXg=?=)
Date: Wed, 31 Mar 2010 19:04:23 +0200
Subject: java.util.Pair
In-Reply-To: <z2r108fcdeb1003310834lee0d1795v29030356de52d4bd@mail.gmail.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>	
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>	
	<4BB28A9C.10705@oracle.com> <4BB2FA4A.6090809@univ-mlv.fr>
	<z2r108fcdeb1003310834lee0d1795v29030356de52d4bd@mail.gmail.com>
Message-ID: <4BB38097.5090107@univ-mlv.fr>

On 31/03/2010 17:34, Kevin Bourrillion wrote:
> On Wed, Mar 31, 2010 at 12:31 AM, Rémi Forax <forax at univ-mlv.fr 
> <mailto:forax at univ-mlv.fr>> wrote:
>
>     In equals, instanceof Pair should be instanceof Pair<?,?>.
>     Pair is a raw type.
>
>
> Tangent: there are those of us who believe javac is quite mistaken to 
> issue a warning on 'instanceof Pair'.

You're not the only one, but I think you're wrong.

>  (And even if it were right in theory (which I don't think it is), 
> weren't warnings supposed to be things that would warn you about 
> possible /bugs/?)

Possible bug:
the semantics of instanceof Foo and instanceof Foo<?> would differ if 
generics were reified.

Example:
class Foo<T> { }

instanceof Foo<?> and instanceof Foo are equivalent.

Now suppose I change the definition of Foo to:
class Foo<T extends Number> { }

I recompile the class Foo and forget to recompile that code:
Foo<?> foobar = new Foo<String>();

foobar instanceof Foo is ok
but foobar instanceof Foo<?> must raise an IncompatibleClassChangeError.
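
For reference, a minimal sketch of the current (erased) behaviour, where the
two forms give the same answer at run time.  The class InstanceofDemo and its
method names are made up for this illustration; the reified behaviour
described above is hypothetical and cannot be shown with today's javac.

```java
final class InstanceofDemo {
    // Raw-type check: javac reports a rawtypes lint warning for this form.
    @SuppressWarnings("rawtypes")
    static boolean rawCheck(Object o) {
        return o instanceof Foo;
    }

    // Unbounded-wildcard check: no warning.
    static boolean wildcardCheck(Object o) {
        return o instanceof Foo<?>;
    }

    public static void main(String[] args) {
        Object foo = new Foo<String>();
        System.out.println(rawCheck(foo) + " " + wildcardCheck(foo));   // true true
        System.out.println(rawCheck("x") + " " + wildcardCheck("x"));   // false false
    }
}

class Foo<T> { }
```

With erasure both checks test only the runtime class, so they can differ only
in the warning they produce.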

>
>
> -- 
> Kevin Bourrillion @ Google
> internal: http://goto/javalibraries
> external: http://guava-libraries.googlecode.com
>

Rémi

From kevinb at google.com  Wed Mar 31 17:20:32 2010
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 31 Mar 2010 10:20:32 -0700
Subject: java.util.Pair
In-Reply-To: <4BB38097.5090107@univ-mlv.fr>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com>
	<4BB28A9C.10705@oracle.com> <4BB2FA4A.6090809@univ-mlv.fr>
	<z2r108fcdeb1003310834lee0d1795v29030356de52d4bd@mail.gmail.com>
	<4BB38097.5090107@univ-mlv.fr>
Message-ID: <w2m108fcdeb1003311020k6381e113td45ed25c89da1d9a@mail.gmail.com>

On Wed, Mar 31, 2010 at 10:04 AM, Rémi Forax <forax at univ-mlv.fr> wrote:

>  (And even if it were right in theory (which I don't think it is), weren't
> warnings supposed to be things that would warn you about possible *bugs*?)
>
> possible bug:
> the semantics of instanceof Foo and instanceof Foo<?> is different if
> generics will be reified.
>

With all due respect, I rest my case. :-)

(Meaning: since you chose such a hypothetical future situation as an
illustration, it suggests that indeed no actual bugs are being prevented
here in the real world.)

We have to recognize the fact that it is no small amount of the world's Java
code that would become broken if generics were ever reified.  And, as well,
that -- no, I won't go on about this, because it's now a tangent of a
tangent.


-- 
Kevin Bourrillion @ Google
internal:  http://goto/javalibraries
external: http://guava-libraries.googlecode.com

From neal at gafter.com  Wed Mar 31 17:34:23 2010
From: neal at gafter.com (Neal Gafter)
Date: Wed, 31 Mar 2010 10:34:23 -0700
Subject: java.util.Pair
In-Reply-To: <w2m108fcdeb1003311020k6381e113td45ed25c89da1d9a@mail.gmail.com>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM> 
	<108fcdeb1003301054v211788a8tdb28dcb24aa0d2e4@mail.gmail.com> 
	<4BB28A9C.10705@oracle.com> <4BB2FA4A.6090809@univ-mlv.fr> 
	<z2r108fcdeb1003310834lee0d1795v29030356de52d4bd@mail.gmail.com> 
	<4BB38097.5090107@univ-mlv.fr>
	<w2m108fcdeb1003311020k6381e113td45ed25c89da1d9a@mail.gmail.com>
Message-ID: <q2u15e8b9d21003311034tfde83512w4a3f57a6aabb34fe@mail.gmail.com>

On Wed, Mar 31, 2010 at 10:20 AM, Kevin Bourrillion <kevinb at google.com> wrote:

> With all due respect, I rest my case. :-)
>
> (Meaning: since you chose such a hypothetical future situation as an
> illustration, it suggests that indeed no actual bugs are being prevented
> here in the real world.)
>
> We have to recognize the fact that it is no small amount of the world's
> Java code that would become broken if generics were ever reified.  And, as
> well, that -- no, I won't go on about this, because it's now a tangent of a
> tangent.
>

That depends on how the reification is done.  Reification as described in
<http://gafter.blogspot.com/2006/11/reified-generics-for-java.html> would
break no existing code.

From i30817 at gmail.com  Wed Mar 31 19:57:38 2010
From: i30817 at gmail.com (Paulo Levi)
Date: Wed, 31 Mar 2010 20:57:38 +0100
Subject: java.util.Pair
Message-ID: <i2y212322091003311257yd16b7169rde9e13b17de891d3@mail.gmail.com>

Please don't add this. I have my own tuple parametric class. In fact it is
easy to do.
http://code.google.com/p/bookjar-utils/source/browse/BookJar-utils/src/util/Tuples.java

However, I never use it anymore. It is easy to write and use, but really stupid,
since the names (first, second, third...) are so generic... and it has no
behavior.
Invariably I have to replace it with a more domain-appropriate class with real
names & methods. A real tuple (where the names don't matter...) might be
usable generally, but not a tuple-like normal type.

From martinrb at google.com  Wed Mar 31 23:06:43 2010
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 31 Mar 2010 16:06:43 -0700
Subject: Wording improvements for String.indexOf, String.lastIndexOf
Message-ID: <s2j1ccfd1c11003311606je42e5119z27f975960c2747f9@mail.gmail.com>

Hi Alan, Xueming,

I'd like you to do a code review.

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/lastIndexOf2/

A colleague suggested wording improvements for String.indexOf and
String.lastIndexOf.
At least, this makes the javadoc less gratuitously inconsistent.

Since I'm already coincidentally fixing lastIndexOf, I can either fold
this patch into lastIndexOf or you can file a separate bug - your choice.

Thanks,

Martin


From kevin.l.stern at gmail.com  Wed Mar 31 23:36:13 2010
From: kevin.l.stern at gmail.com (Kevin L. Stern)
Date: Wed, 31 Mar 2010 18:36:13 -0500
Subject: A List implementation backed by multiple small arrays rather than
	the traditional single large array.
In-Reply-To: <1ccfd1c11003301520g564876fehfce57def62f6d6b3@mail.gmail.com>
References: <1704b7a21003280455u784d4d2ape39a47e2367b79a8@mail.gmail.com>
	<1ccfd1c11003290023u5c59f926o8ceb79fe0d3bbc6f@mail.gmail.com>
	<1704b7a21003300425i7dd1ef7he28728ad3cdb60e2@mail.gmail.com>
	<1ccfd1c11003301520g564876fehfce57def62f6d6b3@mail.gmail.com>
Message-ID: <u2t1704b7a21003311636q35093107l60a1d931d6aa6f66@mail.gmail.com>

What am I missing here?  In "Resizable arrays in optimal time and space" the
authors define their data structure with the following property:

(1)  "When superblock SB_k is fully allocated, it consists of 2^(floor(k/2))
data blocks, each of size 2^(ceil(k/2))."

Since the superblocks are indexed from zero, this implies the following
structure:

SB_0: [1]
SB_1: [2]
SB_2: [2][2]
SB_3: [4][4]
SB_4: [4][4][4][4]
[...]

Let's have a look at Algorithm 3, Locate(i), with i = 3:

r = 100 (the binary expansion of i + 1)
k = |r| - 1 = 2
p = 2^k - 1 = 3

What concerns me is their statement that p represents "the number of data
blocks in superblocks prior to SB_k."  There are only two data blocks in
superblocks prior to SB_2, not three.  Given (1) above, unless I'm
misinterpreting it, the number of data blocks in superblocks prior to SB_k
should be:

2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1)

This, of course, seems to work out much better in my example above, giving
the correct answer to my interpretation of their data structure, but I have
a hard time believing that this is their mistake rather than my
misinterpretation.
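
To make the arithmetic easy to check, here is a small sketch that counts the
data blocks directly from property (1) as read above and compares the count
with the closed form and with p = 2^k - 1.  The class and method names are
invented for this check, and the closed form is only used for even k; it
says nothing about what the paper intends.

```java
final class SuperblockCount {
    // Data blocks in superblock SB_k under property (1): 2^floor(k/2).
    static int dataBlocksIn(int k) {
        return 1 << (k / 2);
    }

    // Data blocks in all superblocks before SB_k, counted one by one.
    static int dataBlocksBefore(int k) {
        int total = 0;
        for (int j = 0; j < k; j++) {
            total += dataBlocksIn(j);
        }
        return total;
    }

    // Closed form 2 * (2^(k/2) - 1); only meaningful for even k.
    static int closedFormEven(int k) {
        return 2 * ((1 << (k / 2)) - 1);
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 6; k += 2) {
            System.out.println("k=" + k
                    + " counted=" + dataBlocksBefore(k)
                    + " closedForm=" + closedFormEven(k)
                    + " p=" + ((1 << k) - 1));
        }
    }
}
```

For k = 2 the direct count and the closed form both give 2, while 2^k - 1
gives 3, which is the discrepancy in question.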

Thoughts?

Kevin

On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz <martinrb at google.com> wrote:

> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
> > Hi Martin,
> >
> > Thanks much for your feedback.  The first approach that comes to mind to
> > implement O(1) time front as well as rear insertion is to create a cyclic
> > list structure with a front/rear pointer - to insert at the front
> > requires decrementing the front pointer (modulo the size) and to insert
> > at the rear requires incrementing the rear pointer (modulo the size).  We
> > need to resize when the two pointers bump into each other.  Could you
> > explain more about your suggestion of introducing an arraylet that is
> > shared by the front and the rear?
>
> It was a half-baked idea - I don't know if there's a way to turn it into
> something useful.  I was thinking of the ArrayDeque implementation,
> where all the elements live in a single array.
>
> >  It's not clear to me how that would help and/or be a better
> > approach than the cyclic list.  Anyhow, the paper that you reference,
> > "Resizable arrays in optimal time and space", gives a deque so if we take
> > that approach then the deque is specified.
>
> Technically, ArrayList also supports the Deque operations -
> just not efficiently.
>
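
The cyclic structure described in the quoted text can be sketched roughly as
follows, over a single backing array.  This is only an illustration under
stated assumptions: CyclicBuffer and its members are hypothetical names, the
rear position is derived from front + size rather than stored as a second
pointer, and resizing simply doubles the array when it is full.

```java
final class CyclicBuffer<E> {
    private Object[] items = new Object[4];
    private int front = 0;   // index of the first element
    private int size = 0;    // rear position is (front + size) mod capacity

    // Insert at the front: step the front index back, wrapping around.
    void addFirst(E e) {
        growIfFull();
        front = (front - 1 + items.length) % items.length;
        items[front] = e;
        size++;
    }

    // Insert at the rear: write just past the last element, wrapping around.
    void addLast(E e) {
        growIfFull();
        items[(front + size) % items.length] = e;
        size++;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return (E) items[(front + index) % items.length];
    }

    int size() { return size; }

    // "The two pointers bump into each other" corresponds to size == capacity.
    private void growIfFull() {
        if (size < items.length) return;
        Object[] bigger = new Object[items.length * 2];
        for (int i = 0; i < size; i++) {
            bigger[i] = items[(front + i) % items.length];
        }
        items = bigger;
        front = 0;
    }

    public static void main(String[] args) {
        CyclicBuffer<Integer> buf = new CyclicBuffer<Integer>();
        for (int i = 1; i <= 5; i++) buf.addLast(i);   // fifth add triggers a resize
        buf.addFirst(0);
        for (int i = 0; i < buf.size(); i++) System.out.print(buf.get(i) + " ");
        System.out.println();
    }
}
```

Both insertions are O(1) apart from the occasional copy on resize, so they
are amortized O(1) with doubling.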

From cadenza at paradise.net.nz  Tue Mar 23 06:08:58 2010
From: cadenza at paradise.net.nz (Bruce Chapman & Barbara Carey)
Date: Tue, 23 Mar 2010 19:08:58 +1300
Subject: Kinda ?
In-Reply-To: <4BA7F97C.5070202@sun.com>
References: <1ccfd1c11003201136u78e159ew88724bfa5a9e28c0@mail.gmail.com>
	<4BA9256A.2020602@sun.com> <4BA7F6EB.6040804@gmx.de>
	<4BA7F97C.5070202@sun.com>
Message-ID: <4BA85AFA.70005@paradise.net.nz>

Paul Hohensee wrote:
> "in a way" plus "somewhat", as in "it's kinda bad" == "in a way, it's 
> somewhat bad".
>
> On 3/22/10 7:02 PM, Ulf Zibis wrote:
>> Can somebody betray the sense of "Kinda" to me?
>>
>> -Ulf
>>
>>
>
A spoken contraction of "kind of" (similar in meaning to "sorta", a 
contraction of "sort of").

Nothing to do with children (Kinder), although you might sometimes see it 
spelt that way too.

Bruce


From tom.hawtin at oracle.com  Tue Mar 30 11:15:37 2010
From: tom.hawtin at oracle.com (tom.hawtin at oracle.com)
Date: Tue, 30 Mar 2010 12:15:37 +0100
Subject: java.util.Pair
In-Reply-To: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
References: <EDA0E349-DBE4-45DA-BF05-83C62A786116@Sun.COM>
Message-ID: <4BB1DD59.6000007@oracle.com>

On 30/03/2010 09:08, Weijun Wang wrote:

 > I know such a simple thing can be made very complex and everyone might
 > want to add a new method into it. How about we just make it most
 > primitive? Simply an immutable and Serializable class, two final
 > fields, one constructor, two getters (?), and no static factory
 > methods.

Even with the diamond operator, I'd prefer a static creation method to a 
constructor. Immutable value classes really should not have 
constructors. I'd also like to support Comparable for Comparables.
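
A rough sketch of what that could look like, assuming a hypothetical Pair
with a private constructor, a static creation method named of, and a
comparator for pairs whose components are both Comparable.  All names here
are illustrative; since a class cannot implement Comparable only for some
type arguments, the ordering is offered as a separate Comparator rather than
on Pair itself.

```java
import java.util.Comparator;

final class Pair<A, B> {
    private final A first;
    private final B second;

    // No public constructor: instances come from the static creation method.
    private Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    // Type arguments are inferred at the call site, e.g. Pair.of("a", 1).
    static <A, B> Pair<A, B> of(A first, B second) {
        return new Pair<A, B>(first, second);
    }

    A first() { return first; }
    B second() { return second; }

    // Ordering by first component, then second; only available when both
    // component types are Comparable.
    static <A extends Comparable<? super A>, B extends Comparable<? super B>>
    Comparator<Pair<A, B>> lexicographic() {
        return new Comparator<Pair<A, B>>() {
            @Override public int compare(Pair<A, B> x, Pair<A, B> y) {
                int c = x.first.compareTo(y.first);
                return c != 0 ? c : x.second.compareTo(y.second);
            }
        };
    }

    public static void main(String[] args) {
        Pair<String, Integer> a = Pair.of("a", 2);
        Pair<String, Integer> b = Pair.of("a", 3);
        Comparator<Pair<String, Integer>> order = Pair.lexicographic();
        System.out.println(order.compare(a, b) < 0);   // true
    }
}
```

The sketch omits equals, hashCode and Serializable support to stay short.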

 >          (S)he who does the real implementation has the privilege to
 > choose between head/tail and car/cdr.

Or are you suggesting an abstract base class to support two-field 
immutables? IMO, a good idea from a strong-typing perspective, but lazy 
programmers will probably want a concrete pair or they'll keep 
implementing their own.

Tom