RFR: 8065138 - Encodings.isRecognizedEnconding sometimes fails to recognize 'UTF8'
Daniel Fuchs
daniel.fuchs at oracle.com
Wed Nov 19 18:15:12 UTC 2014
On 19/11/14 18:01, Martin Buchholz wrote:
> On Wed, Nov 19, 2014 at 3:17 AM, Daniel Fuchs <daniel.fuchs at oracle.com> wrote:
>> Hi,
>>
>> Please find below a trivial fix for
>>
>> 8065138: Encodings.isRecognizedEnconding sometimes fails to
>> recognize 'UTF8'
>> https://bugs.openjdk.java.net/browse/JDK-8065138
>>
>> webrev: http://cr.openjdk.java.net/~dfuchs/webrev_8065138/webrev.00/
>>
>> The root of the issue is with
>> jaxp/src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties
>> It contains a special character 'å' which confuses the build
>> system on Mavericks.
>
> Isn't that a bug in the build system that really ought to be fixed?
>
> If properties files are to be stored as resources in jar files, they
> should either be incorporated byte-for-byte identical, or they should
> be decoded using ISO-8859-1 (as specified). It may be best to leave
> non-ASCII characters in the source files, as a "test" of the build
> system and the jdk itself.
Hmmm. If the character is indeed legal then you're right, fixing
the build is probably a better idea.
However the issue seems to be with using 'sed' over property files:
If I simply do:
  cat jaxp/src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties \
    | sed 's,x,x,g'
on my machine, it balks with:
sed: RE error: illegal byte sequence
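
One plausible reading of the sed error above, sketched in Java: if the file is stored as ISO-8859-1 (an assumption here, not verified against the actual file), 'å' is the single byte 0xE5, which is not a valid sequence when a tool decodes its input as UTF-8. The class name and the sample bytes below are hypothetical and only illustrate the decoding difference; this does not reproduce what sed does internally.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class IllegalByteSequenceDemo {

    // Returns true if the bytes decode without error in the given charset.
    // The decoder is configured to report problems instead of replacing them.
    static boolean decodesCleanly(byte[] bytes, Charset cs) {
        try {
            cs.newDecoder()
              .onMalformedInput(CodingErrorAction.REPORT)
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: 'a', 0xE5, 'b'. 0xE5 is 'å' in ISO-8859-1.
        byte[] sample = { 'a', (byte) 0xE5, 'b' };

        System.out.println("ISO-8859-1 ok: "
                + decodesCleanly(sample, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 ok:      "
                + decodesCleanly(sample, StandardCharsets.UTF_8));
    }
}
```

In UTF-8, 0xE5 starts a three-byte sequence, so a following ASCII byte makes the input malformed; in ISO-8859-1 every byte value maps to a character.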
-- daniel
>
>> The Encodings.properties file ends up truncated in resources.jar - it
>> contains only one line (the line before the special character was
>> encountered).
>> The fix is to replace the special character 'å' with its Unicode
>> escape \u00e5.
>>
>> best regards,
>>
>> -- daniel
>>
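
For illustration, a minimal sketch of why the replacement quoted above should be behavior-preserving for readers of the file: java.util.Properties.load(InputStream) decodes its input as ISO-8859-1 and also understands \uXXXX escapes, so both spellings load to the same value. The key "name" and the class name are hypothetical and are not taken from Encodings.properties.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class EscapedPropertyDemo {

    // Loads the given bytes as a properties file and returns the value for key.
    static String loadValue(byte[] fileBytes, String key) {
        Properties p = new Properties();
        try {
            p.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p.getProperty(key);
    }

    // Raw form: the value is the single byte 0xE5 ('å' in ISO-8859-1).
    static byte[] rawForm() {
        String line = "name=" + (char) 0xE5 + "\n";
        return line.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Escaped form: pure ASCII, the value is the six characters \u00e5.
    static byte[] escapedForm() {
        String line = "name=" + "\\" + "u00e5" + "\n";
        return line.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String raw = loadValue(rawForm(), "name");
        String escaped = loadValue(escapedForm(), "name");
        System.out.println("same value: " + raw.equals(escaped));
    }
}
```

The escaped form contains only ASCII bytes, so a byte-oriented or locale-sensitive tool has nothing to misinterpret.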