Sponsor for 6666666: A better implementation of Character.isSupplementaryCodePoint

Thu Mar 25 23:33:44 UTC 2010

On Thu, Mar 25, 2010 at 13:26, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 24.03.2010 20:34, Martin Buchholz wrote:
>>
>> On Wed, Mar 24, 2010 at 10:20, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>
>>>
>>> On 23.03.2010 23:59, Martin Buchholz wrote:
>>>
>>
>>
>>>
>>> I too would like to see 8 spaces indentation on line breaks like:
>>>    if (aaaaaaaaaaaaaaa > bbbbbbbbbbbbb &&
>>>            ccccccccccccccc > ddddddddddddddddd)
>>>        doSomething();
>>>
>>
>> This appears to be a new style (perhaps coming from the Java IDEs?)
>>
>
> This rule is much older:
> http://java.sun.com/docs/codeconv/html/CodeConventions.doc3.html#248
> But yes, I first saw this in the NetBeans IDE's formatting facility.

Ahhh, thank you very much for this history lesson.

I have manually adjusted some source files as you requested,
but systematically fixing this particular coding style bug
is likely to be difficult.

>>>
>>> +
>>>     * @see    #forDigit(int, int)
>>>     * @see    Integer#toString(int, int)
>>> instead of:
>>>     * @see     java.lang.Character#forDigit(int, int)
>>>     * @see     java.lang.Integer#toString(int, int)
>>>
>>
>> I did a global s/java\.lang\.// in Character.java.
>>
>
> As justified before, I would drop the current class's name.
> See: http://java.sun.com/j2se/javadoc/writingdoccomments/index.html#tag

For this particular source file,
I am going to mildly disagree with you,
and keep as is.

>>>         * range: U+DC00 through U+DFFF
>>> instead of
>>>         * range: 0xDC00 through 0xDFFF
>>>
>>
>> I disagree.  The U+ notation should be reserved for
>> Unicode characters (code points) and not UTF-16
>> code units (which surrogates are).
>>
>
> I fully agree, but in the context where I wanted to change this, the matter
> actually was about code points, not code units, and ...
> in the case of Java char/UTF-16 code units, IMO we should use \u notation.
> 0x notation should only be used for binary values of non-Unicode charsets.

Oh, I see.  You are right.  Patch coming up.
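
For readers following along, the distinction can be sketched in a few
lines of Java. This is only an illustration, not part of the patch: the
class SurrogateDemo, its helper methods, and the sample code point
U+1D11E are my own choices. It writes the code point in U+ notation and
the UTF-16 code units that encode it in backslash-u notation.

```java
// Illustrative sketch only; the class and method names are hypothetical.
final class SurrogateDemo {

    /** Returns true if cp lies in the supplementary range U+10000 through U+10FFFF. */
    static boolean isSupplementary(int cp) {
        return Character.isSupplementaryCodePoint(cp);
    }

    /** Encodes one code point as UTF-16 code units (one or two chars). */
    static char[] toUtf16(int cp) {
        return Character.toChars(cp);
    }

    public static void main(String[] args) {
        int cp = 0x1D11E; // sample code point, chosen arbitrarily for this sketch
        System.out.printf("U+%X supplementary: %b%n", cp, isSupplementary(cp));

        // A supplementary code point is stored as a high/low surrogate pair.
        for (char unit : toUtf16(cp)) {
            System.out.printf("  code unit \\u%04X high=%b low=%b%n",
                              (int) unit,
                              Character.isHighSurrogate(unit),
                              Character.isLowSurrogate(unit));
        }
    }
}
```

A lone surrogate such as 0xDC00 is a code unit, so
isSupplementary(0xDC00) is false, while the pair produced for U+1D11E
consists of one high and one low surrogate.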

> BTW, I can't find any documentation about {@linkplain ...}.
> What is the advantage over simple {@link ...}?

http://java.sun.com/j2se/1.4.2/docs/tooldocs/javadoc/whatsnew-1.4.html
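
In short, {@linkplain} behaves like {@link} except that the label is
rendered as plain text rather than in code font, which reads better when
the label is a phrase instead of an identifier. A minimal sketch, with a
hypothetical class LinkTagDemo that exists only for this illustration:

```java
/**
 * Hypothetical class used only to illustrate the two inline tags.
 */
final class LinkTagDemo {

    /**
     * Delegates to {@link Character#forDigit(int, int)}; here the label
     * is the member name and is rendered in code font.
     *
     * <p>The same target written as a
     * {@linkplain Character#forDigit(int, int) digit conversion method}
     * gets a prose label rendered as plain text.
     */
    static char digit(int value, int radix) {
        return Character.forDigit(value, radix);
    }

    public static void main(String[] args) {
        System.out.println(digit(10, 16));
    }
}
```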

> Additionally, I would like to mention for class Character:
> - numerous javadoc blocks are indented by only 3 instead of 4 spaces.

Addressed in one of my current patches.

> - several UnicodeBlock declarations differ slightly in indentation/whitespace
> usage from the prevailing style. I would prefer:
>        public static final UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_A =
>            new UnicodeBlock("SUPPLEMENTARY_PRIVATE_USE_AREA_A",
>                             new String[] { "Supplementary Private Use Area-A",
>                                            "SupplementaryPrivateUseArea-A" });

See forthcoming patch.