8000685: (props) Properties.storeToXML/loadFromXML should only require UTF-8 and UTF-16 to be supported

Alan Bateman Alan.Bateman at oracle.com
Fri Oct 12 10:19:29 UTC 2012


A few days ago we added a JDK private provider interface [1] to which 
the Properties loadFromXML and storeToXML methods will delegate. The 
motive, as I mentioned, is to allow for smaller environments where JAXP 
might not be present (so motivated by both modules and the compact 
profiles work).

The next step in this effort is dealing with the issue of arbitrary 
encodings. The storeToXML method allows the encoding to be specified; 
the loadFromXML method assumes that the implementation can read the 
encoding declaration and decode the stream. The specification doesn't 
make it clear how either method behaves with unrecognized encodings, and 
this is something that we need to fix in order to allow for alternative 
implementations, in particular tiny parsers that might not support more 
than a few encodings.

The webrev with the proposed changes is here:

http://cr.openjdk.java.net/~alanb/8000685/webrev/

The proposal is that an implementation minimally supports UTF-8 and 
UTF-16, which I think is consistent with the W3C XML specification [2].
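
As a rough illustration of what the proposal guarantees (this is not code 
from the webrev; the class and method names are made up for the example), 
a round trip through storeToXML and loadFromXML with the two required 
encodings might look like this:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

public class XmlPropsRoundTrip {

    // Hypothetical helper: writes the properties as XML in the given
    // encoding, then reads them back. loadFromXML takes no encoding
    // argument; it relies on the encoding declaration in the document.
    static Properties roundTrip(Properties in, String encoding) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.storeToXML(out, "round-trip example", encoding);

        Properties result = new Properties();
        result.loadFromXML(new ByteArrayInputStream(out.toByteArray()));
        return result;
    }

    public static void main(String[] args) throws IOException {
        Properties p = new Properties();
        p.setProperty("greeting", "h\u00e9llo"); // non-ASCII value to exercise the encoding

        // UTF-8 and UTF-16 are the two encodings the proposal requires.
        for (String enc : new String[] {"UTF-8", "UTF-16"}) {
            Properties q = roundTrip(p, enc);
            System.out.println(enc + ": " + p.equals(q));
        }
    }
}
```

Any other encoding name passed to roundTrip may or may not work, 
depending on which provider is installed.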

Based on a search of a large number of projects, it appears that these 
methods aren't used very much, so I don't think this will have any 
significant impact. In addition, the same set of encodings [ which is not 
exactly the same set as Charset.availableCharsets().keySet() ] that 
works today will continue to work when the service provider that uses 
JAXP is installed.

In addition to specifying the required encodings, I have also changed 
both methods to specify that UnsupportedEncodingException may be thrown. 
In the case of loadFromXML, this is the long standing behavior anyway. 
In the case of storeToXML, the long standing behavior is somewhat 
bizarre: if the method is invoked with an unsupported encoding, the 
underlying Xalan code prints a warning to System.out and changes the 
encoding under the covers to UTF-8. I've submitted a bug on this oddity; 
in the meantime I've added a check in the platform provider to always 
fail for charsets that aren't recognized.
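
To give an idea of the kind of check meant here, below is a minimal 
sketch. It is not the code in the webrev; EncodingCheck and 
requireCharset are hypothetical names, and the sketch assumes that 
"recognized" means Charset.forName can resolve the name.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

public class EncodingCheck {

    // Hypothetical helper: resolves the encoding name to a Charset, or
    // fails with UnsupportedEncodingException instead of silently
    // falling back to UTF-8.
    static Charset requireCharset(String encoding) throws UnsupportedEncodingException {
        try {
            return Charset.forName(encoding);
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            throw new UnsupportedEncodingException(encoding);
        }
    }

    public static void main(String[] args) {
        for (String enc : new String[] {"UTF-8", "no-such-charset"}) {
            try {
                System.out.println(enc + " resolved to " + requireCharset(enc));
            } catch (UnsupportedEncodingException e) {
                System.out.println(enc + " rejected");
            }
        }
    }
}
```

A provider would run a check like this before handing the encoding name 
to the underlying serializer, so the caller sees the exception rather 
than output in a different encoding than the one requested.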

-Alan.

[1] http://hg.openjdk.java.net/jdk8/tl/jdk/rev/f65871e75fde
[2] http://www.w3.org/TR/REC-xml/#charencoding


