[9] RFC JDK-8068373: (prefs) FileSystemPreferences writes \0 to XML storage, causing loss of all preferences
Paul Sandoz
paul.sandoz at oracle.com
Thu Feb 12 17:29:12 UTC 2015
On Feb 12, 2015, at 5:27 PM, Brian Burkhalter <brian.burkhalter at oracle.com> wrote:
>
> On Feb 12, 2015, at 8:18 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>
>>> This is a morass and I hope that someone more apt to know it well would comment. The U+0000 null control character is always illegal though I do know that.
>>
>> Yes. IIRC XML 1.1 basically allows any character except U+0000.
>
> "Note that the code point U+0000, assigned to the null control character, is the only character encoded in Unicode and ISO/IEC 10646 that is always invalid in any XML 1.0 and 1.1 document."
>
> http://en.wikipedia.org/wiki/Valid_characters_in_XML#Characters_allowed_but_discouraged
>
>>>
>>> The problem is that on OSX and Windows prefs are not stored to XML
>>
>> What are they stored in? name/value pairs?
>
> Yes. On OSX I think Property List (.plist) files; on Windows I do not know yet.
>
>>> whereas on Unix they are.
>>
>> Is that a specification requirement?
>
> No, I think it’s an artifact of no inherent DB-like facility in the system.
>
>>> That would make it an error to add such a value to the prefs on some platforms but not on others.
>>
>> Yes, for an interoperable format potentially read by other tools having U+0000 is a really bad idea.
>
> Yep.
>
And I think that applies to plist files too.
>> My inclination is that if properties are written out to a text file then it should fail if a key/value contains U+0000. (Binary data should be base64 encoded in such cases.) Replacing just subtly hides or defers the issue.
>
> That was my original idea in fact (webrev.00, unpublished). It would however require a spec update to
>
> http://docs.oracle.com/javase/8/docs/api/java/util/prefs/Preferences.html#put-java.lang.String-java.lang.String-
>
> to allow for an IAE in this case. This would also be a backward-incompatible change for platforms which do allow storing such values.
>
> Note that a similar situation applies to Properties.
>
My recommendation is that serialization of properties to any textual format should barf if a U+0000 is encountered. Otherwise it's just hiding bugs. In such cases I think there is a strong justification to introduce such an incompatible change. I expect it is rare to encounter in practice.
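
To make the suggestion concrete, here is a rough sketch of the kind of check I have in mind. The class and method names (NulCheck, requireNoNul, encodeBinary) are made up for illustration and are not part of java.util.prefs or of any webrev; the choice of IllegalArgumentException just follows the IAE mentioned above, and a real fix might well look different.

```java
import java.util.Base64;

// Hypothetical helper for illustration only; not an existing JDK class.
final class NulCheck {

    // Returns s unchanged, or throws if it contains U+0000.
    static String requireNoNul(String s) {
        int i = s.indexOf('\0');
        if (i >= 0) {
            throw new IllegalArgumentException("U+0000 at index " + i);
        }
        return s;
    }

    // Binary data would be base64 encoded before being stored as text.
    static String encodeBinary(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // A normal value passes through untouched.
        System.out.println(requireNoNul("plain value"));

        // Raw bytes, including a zero byte, become safe text.
        System.out.println(encodeBinary(new byte[] {0, 1, 2}));

        // A value containing U+0000 is rejected instead of being rewritten.
        try {
            requireNoNul("bad\0value");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast like this surfaces the caller's bug at the point where the value is stored, rather than when the backing file is next written or read.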
Paul.