Unicode script support in Regex and Character class

Ulf Zibis Ulf.Zibis at gmx.de
Mon Apr 26 11:22:21 UTC 2010


On 24.04.2010 01:09, Xueming Shen wrote:
> Ulf Zibis wrote:
>>
>> - I like the idea of saving the data in a compressed binary file 
>> instead of classfile static data.
>> - Wouldn't PreHashMaps be initialized faster than normal HashMaps in 
>> j.l.Character.UnicodeScript and j.l.CharacterName?
> I don't think so. The key for these 2 cases is the whole Unicode 
> range. But you can always try. The current
> binary search for script is definitely not a perfect solution.

I think the aliases map is perfectly suited to a PreHashMap.

Anyway, I would prefer the following (just syntactic sugar plus an initial capacity):
-        private static HashMap<String, Character.UnicodeScript> aliases;
+        private final static HashMap<String, Character.UnicodeScript> aliases = new HashMap<>(values().length);
         static {
-            aliases = new HashMap<String, UnicodeScript>();
             aliases.put("ARAB", ARABIC);
             ...
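
A minimal, self-contained sketch of the change above, in case it helps to see it compile. The enum Script and the helper forAlias are hypothetical stand-ins for Character.UnicodeScript and its lookup, reduced to three constants. One assumption differs from the diff: the capacity is scaled by HashMap's default load factor of 0.75, because a map created with exactly values().length buckets may still resize before all aliases are inserted.

```java
import java.util.HashMap;
import java.util.Map;

final class AliasDemo {
    // Hypothetical stand-in for Character.UnicodeScript, reduced to three constants.
    enum Script {
        ARABIC, GREEK, LATIN;

        // Declared final and initialized in place. The capacity is scaled by the
        // default load factor (0.75) so one alias per constant fits without a rehash.
        private static final Map<String, Script> aliases =
                new HashMap<>((int) (values().length / 0.75f) + 1);

        static {
            aliases.put("ARAB", ARABIC);
            aliases.put("GREK", GREEK);
            aliases.put("LATN", LATIN);
        }

        // Hypothetical lookup helper; returns null for an unknown alias.
        static Script forAlias(String alias) {
            return aliases.get(alias);
        }
    }

    public static void main(String[] args) {
        System.out.println(Script.forAlias("ARAB")); // prints ARABIC
    }
}
```

Calling values() in the field initializer is safe here because enum constants are initialized before any other static field of the enum.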

Alternatively (see Project Coin):
     private final static HashMap<String, Character.UnicodeScript> aliases = {
         "ARAB" : ARABIC,
         ...,
     }
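
The literal syntax above is a proposal, not valid Java. As a rough approximation of the same idea, the sketch below uses the Map.of factory method available since Java 9, which is a different technique from the one proposed here. The enum Script is again a hypothetical stand-in for Character.UnicodeScript. Note that Map.of returns an immutable Map rather than a HashMap and accepts at most ten key/value pairs; Map.ofEntries would be needed for a larger alias table.

```java
import java.util.Map;

final class AliasLiteralDemo {
    // Hypothetical stand-in for Character.UnicodeScript.
    enum Script { ARABIC, GREEK, LATIN }

    // Immutable map built in a single expression, close in spirit to a map literal.
    static final Map<String, Script> ALIASES = Map.of(
            "ARAB", Script.ARABIC,
            "GREK", Script.GREEK,
            "LATN", Script.LATIN);

    public static void main(String[] args) {
        System.out.println(ALIASES.get("GREK")); // prints GREEK
    }
}
```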


-Ulf





More information about the core-libs-dev mailing list